RPS2 GENE AND USES THEREOF
Background of the Invention
The invention relates to recombinant plant nucleic
acids and polypeptides and uses thereof to confer disease
resistance to pathogens in transgenic plants.

Plants employ a variety of defensive strategies to
combat pathogens. One defense response, the so-called
hypersensitive response (HR), involves rapid localized
necrosis of infected tissue. In several host-pathogen
interactions, genetic analysis has revealed a gene-for-gene
correspondence between a particular avirulence (avr)
gene in an avirulent pathogen, which elicits an HR in the
host, and a particular resistance gene in that host.

Summary of the Invention 
In general, the invention features substantially
pure DNA (for example, genomic DNA, cDNA or synthetic
DNA) encoding an Rps polypeptide as defined below. In
related aspects, the invention also features a vector, a
cell (e.g., a plant cell), and a transgenic plant or seed
thereof which includes such a substantially pure DNA
encoding an Rps polypeptide.

In preferred embodiments, an RPS gene [SEQ. ID
NO: 5] is the RPS2 gene of a plant of the genus
Arabidopsis. In various preferred embodiments, the cell
is a transformed plant cell derived from a cell of a
transgenic plant. In related aspects, the invention
features a transgenic plant containing a transgene which
encodes an Rps polypeptide that is expressed in plant
tissue susceptible to infection by pathogens expressing
the avrRpt2 avirulence gene [SEQ. ID NO: 105] or
pathogens expressing an avirulence signal similarly
recognized by an Rps polypeptide.

In a second aspect, the invention features a
substantially pure DNA which includes a promoter capable
of expressing the RPS2 gene [SEQ. ID NO: 1] in plant
tissue susceptible to infection by bacterial pathogens
expressing the avrRpt2 avirulence gene [SEQ. ID NO: 105].

In preferred embodiments, the promoter is the 
promoter native to an RPS gene. Additionally, 
transcriptional and translational regulatory regions are 
preferably native to an RPS gene. 

The transgenic plants of the invention are
preferably plants which are susceptible to infection by a
pathogen expressing an avirulence gene, preferably the
avrRpt2 avirulence gene [SEQ. ID NO: 105]. In preferred
embodiments the transgenic plant is from the group of
plants consisting of but not limited to Arabidopsis,
tomato, soybean, bean, maize, wheat and rice.

In another aspect, the invention features a method
of providing resistance in a plant to a pathogen which
involves: (a) producing a transgenic plant cell having a
transgene encoding an Rps2 polypeptide wherein the
transgene is integrated into the genome of the transgenic
plant and is positioned for expression in the plant cell;
and (b) growing a transgenic plant from the transgenic
plant cell wherein the RPS2 transgene is expressed in the
transgenic plant.

In another aspect, the invention features a method
of detecting a resistance gene in a plant cell involving:
(a) contacting the RPS2 gene [SEQ. ID NO: 1] or a portion
thereof greater than 18 nucleic acids in length with a
preparation of genomic DNA from said plant cell under
hybridization conditions providing detection of DNA
sequences having about 50% or greater sequence identity
to the DNA sequence of Fig. 2 encoding the Rps2
polypeptide [SEQ. ID NOS: 2-5].

In another aspect, the invention features a method
of producing an Rps2 polypeptide which involves: (a)
providing a cell transformed with DNA encoding an Rps2
polypeptide positioned for expression in the cell; (b)
culturing the transformed cell under conditions for
expressing the DNA; and (c) isolating the Rps2
polypeptide.

In another aspect, the invention features
substantially pure Rps2 polypeptide. Preferably, the
polypeptide includes a greater than 50 amino acid
sequence substantially identical to a greater than 50
amino acid sequence shown in Fig. 2, open reading frame
"a". Most preferably, the polypeptide is the Arabidopsis
thaliana Rps2 polypeptide [SEQ. ID NOS: 2-5].

In another aspect, the invention features a method
of providing resistance in a transgenic plant to
infection by pathogens which do not carry the avrRpt2
avirulence gene wherein the method includes: (a)
producing a transgenic plant cell having a transgene
encoding an Rps2 polypeptide as well as a transgene
encoding the avrRpt2 gene product [SEQ. ID NO: 106]
wherein the transgenes are integrated into the genome of
the transgenic plant; are positioned for expression in
the plant cell; and the avrRpt2 transgene and, if
desired, the RPS2 gene, are under the control of
regulatory sequences suitable for controlled expression
of the gene(s); and (b) growing a transgenic plant from
the transgenic plant cell wherein the RPS2 and avrRpt2
transgenes are expressed in the transgenic plant.

In another aspect, the invention features a method
of providing resistance in a transgenic plant to
infection by pathogens in the absence of avirulence gene
expression in the pathogen wherein the method involves:
(a) producing a transgenic plant cell having integrated
in the genome a transgene containing the RPS2 gene under
the control of a promoter providing constitutive
expression of the RPS2 gene; and (b) growing a transgenic
plant from the transgenic plant cell wherein the RPS2
transgene is expressed constitutively in the transgenic
plant.

In another aspect, the invention features a method
of providing controllable resistance in a transgenic
plant to infection by pathogens in the absence of
avirulence gene expression in the pathogen wherein the
method involves: (a) producing a transgenic plant cell
having integrated in the genome a transgene containing
the RPS2 gene under the control of a promoter providing
controllable expression of the RPS2 gene; and (b) growing
a transgenic plant from the transgenic plant cell wherein
the RPS2 transgene is controllably expressed in the
transgenic plant. In preferred embodiments, the RPS2
gene is expressed using a tissue-specific or cell
type-specific promoter, or by a promoter that is activated by
the introduction of an external signal or agent, such as
a chemical signal or agent.

By "disease resistance gene" is meant a gene
encoding a polypeptide capable of triggering the plant
defense response in a plant cell or plant tissue. An RPS
gene is a disease resistance gene having about 50% or
greater sequence identity to the RPS2 sequence [SEQ. ID
NO: 1] of Fig. 2 or a portion thereof. The gene, RPS2, is
a disease resistance gene encoding the Rps2 disease
resistance polypeptide [SEQ. ID NOS: 2-5] from Arabidopsis
thaliana.

By "polypeptide" is meant any chain of amino
acids, regardless of length or post-translational
modification (e.g., glycosylation or phosphorylation).

By "substantially identical" is meant a
polypeptide or nucleic acid exhibiting at least 50%,
preferably 85%, more preferably 90%, and most preferably
95% homology to a reference amino acid or nucleic acid
sequence. For polypeptides, the length of comparison
sequences will generally be at least 16 amino acids,
preferably at least 20 amino acids, more preferably at
least 25 amino acids, and most preferably 35 amino acids.
For nucleic acids, the length of comparison sequences
will generally be at least 50 nucleotides, preferably at
least 60 nucleotides, more preferably at least 75
nucleotides, and most preferably 110 nucleotides.
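
The thresholds in this definition can be illustrated with a
minimal Python sketch. It is not the method used by the
software package named in the specification: it assumes the
two sequences have already been aligned to equal length
without gaps, and it counts only exact matches. The function
names and the MIN_LENGTH table are hypothetical; the
numerical values are the minimum figures given above.

```python
# Minimum comparison lengths taken from the definition above.
MIN_LENGTH = {"polypeptide": 16, "nucleic acid": 50}


def percent_identity(ref: str, query: str) -> float:
    """Percent of matching positions in two pre-aligned, ungapped sequences."""
    if len(ref) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not ref:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(ref.upper(), query.upper()) if a == b)
    return 100.0 * matches / len(ref)


def is_substantially_identical(ref: str, query: str, kind: str,
                               threshold: float = 50.0) -> bool:
    """Apply the minimum-length rule, then the percent threshold.

    threshold defaults to the 50% floor; 85, 90 or 95 correspond to the
    preferred, more preferred and most preferred levels in the text.
    """
    if len(ref) < MIN_LENGTH[kind]:
        return False
    return percent_identity(ref, query) >= threshold
```

A real comparison would first align the sequences and score
gaps and substitutions, which this sketch deliberately omits.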

Sequence identity is typically measured using
sequence analysis software (e.g., Sequence Analysis
Software Package of the Genetics Computer Group,
University of Wisconsin Biotechnology Center, 1710
University Avenue, Madison, WI 53705). Such software
matches similar sequences by assigning degrees of
homology to various substitutions, deletions,
and other modifications. Conservative
substitutions typically include substitutions within the
following groups: glycine, alanine; valine, isoleucine,
leucine; aspartic acid, glutamic acid, asparagine,
glutamine; serine, threonine; lysine, arginine; and
phenylalanine, tyrosine.
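
As a hedged illustration, the six groups listed above can be
written as a small lookup in Python using standard one-letter
amino acid codes. The names CONSERVATIVE_GROUPS and
is_conservative_substitution are hypothetical, and the sketch
only tests group membership; it does not reproduce how any
particular software package scores substitutions.

```python
# The six groups from the text, as one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    {"G", "A"},            # glycine, alanine
    {"V", "I", "L"},       # valine, isoleucine, leucine
    {"D", "E", "N", "Q"},  # aspartic acid, glutamic acid, asparagine, glutamine
    {"S", "T"},            # serine, threonine
    {"K", "R"},            # lysine, arginine
    {"F", "Y"},            # phenylalanine, tyrosine
]


def is_conservative_substitution(a: str, b: str) -> bool:
    """True if a and b differ but belong to the same listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```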

By a "substantially pure polypeptide" is meant an
Rps2 polypeptide which has been separated from components
which naturally accompany it. Typically, the polypeptide
is substantially pure when it is at least 60%, by weight,
free from the proteins and naturally-occurring organic
molecules with which it is naturally associated.
Preferably, the preparation is at least 75%, more
preferably at least 90%, and most preferably at least
99%, by weight, Rps2 polypeptide. A substantially pure
Rps2 polypeptide may be obtained, for example, by
extraction from a natural source (e.g., a plant cell); by
expression of a recombinant nucleic acid encoding an Rps2
polypeptide; or by chemically synthesizing the protein.



WO 95/28478    PCT/US95/04570

Purity can be measured by any appropriate method, e.g.,
column chromatography, polyacrylamide gel
electrophoresis, or HPLC analysis.

A protein is substantially free of naturally
associated components when it is separated from those
contaminants which accompany it in its natural state.
Thus, a protein which is chemically synthesized or
produced in a cellular system different from the cell
from which it naturally originates will be substantially
free from its naturally associated components.
Accordingly, substantially pure polypeptides include
those derived from eukaryotic organisms but synthesized
in E. coli or other prokaryotes.

By "substantially pure DNA" is meant DNA that is
free of the genes which, in the naturally-occurring
genome of the organism from which the DNA of the
invention is derived, flank the gene. The term therefore
includes, for example, a recombinant DNA which is
incorporated into a vector; into an autonomously
replicating plasmid or virus; or into the genomic DNA of
a prokaryote or eukaryote; or which exists as a separate
molecule (e.g., a cDNA or a genomic or cDNA fragment
produced by PCR or restriction endonuclease digestion)
independent of other sequences. It also includes a
recombinant DNA which is part of a hybrid gene encoding
additional polypeptide sequence.

By "transformed cell" is meant a cell into which
(or into an ancestor of which) has been introduced, by
means of recombinant DNA techniques, a DNA molecule
encoding (as used herein) an Rps2 polypeptide.

By "positioned for expression" is meant that the
DNA molecule is positioned adjacent to a DNA sequence
which directs transcription and translation of the
sequence (i.e., facilitates the production of, e.g., an
Rps2 polypeptide, a recombinant protein or an RNA
molecule).

By "reporter gene" is meant a gene whose
expression may be assayed; such genes include, without
limitation, β-glucuronidase (GUS), luciferase,
chloramphenicol transacetylase (CAT), and
β-galactosidase.

By "promoter" is meant a minimal sequence sufficient
to direct transcription. Also included in the invention
are those promoter elements which are sufficient to
render promoter-dependent gene expression cell-type
specific, tissue-specific, or inducible by
external signals or agents; such elements may be located
in the 5' or 3' regions of the native gene.

By "operably linked" is meant that a gene and a
regulatory sequence(s) are connected in such a way as to
permit gene expression when the appropriate molecules
(e.g., transcriptional activator proteins) are bound to
the regulatory sequence(s).

By "plant cell" is meant any self-propagating cell
bounded by a semi-permeable membrane and containing a
plastid. Such a cell also requires a cell wall if further
propagation is desired. Plant cell, as used herein,
includes, without limitation, algae, cyanobacteria, seeds,
suspension cultures, embryos, meristematic regions,
callus tissue, leaves, roots, shoots, gametophytes,
sporophytes, pollen, and microspores.

By "transgene" is meant any piece of DNA which is
inserted by artifice into a cell, and becomes part of the
genome of the organism which develops from that cell.
Such a transgene may include a gene which is partly or
entirely heterologous (i.e., foreign) to the transgenic
organism, or may represent a gene homologous to an
endogenous gene of the organism.

By "transgenic" is meant any cell which includes a
DNA sequence which is inserted by artifice into a cell
and becomes part of the genome of the organism which
develops from that cell. As used herein, the transgenic
organisms are generally transgenic plants and the DNA
(transgene) is inserted by artifice into the nuclear or
plastidic genome.

By "pathogen" is meant an organism whose infection
into the cells of viable plant tissue elicits a disease
response in the plant tissue.

Other features and advantages of the invention 
will be apparent from the following description of the 
preferred embodiments thereof, and from the claims. 

Detailed Description 
The drawings will first be described.

Drawings 

Figs. 1A - 1F are a schematic summary of the
physical and RFLP analysis that led to the cloning of the
RPS2 locus.

Fig. 1A is a diagram showing the alignment of the
genetic and the RFLP maps of the relevant portion of
Arabidopsis thaliana chromosome IV adapted from the map
published by Lister and Dean (1993) Plant J. 4:745-750.
The RFLP marker L11F11 represents the left arm of the
YUP11F11 YAC clone.

Fig. 1B is a diagram showing the alignment of
relevant YACs around the RPS2 locus. YAC constructs
designated YUP16G5, YUP18G9 and YUP11F11 were provided by
J. Ecker, University of Pennsylvania. YAC constructs
designated EW3H7, EW11D4, EW11E4, and EW9C3 were provided
by E. Ward, Ciba-Geigy, Inc.

Fig. 1C is a diagram showing the alignment of
cosmid clones around the RPS2 locus. Cosmid clones with
the designation H are derivatives of the EW3H7 YAC clone
whereas those with the designation E are derivatives of
the EW11E4 YAC clone. Vertical arrows indicate the
relative positions of RFLP markers between the ecotypes
La-er and the rps2-101N plant. The RFLP markers were
identified by screening a Southern blot containing more
than 50 different restriction enzyme digests using either
entire cosmid clones or pieces of the corresponding cosmid
clones as probes. The cosmid clones described in Fig. 1C
were provided by J. Giraudat, C.N.R.S., Gif-sur-Yvette,
France.

Figs. 1D and 1E are maps of EcoRI restriction
endonuclease sites in the cosmids E4-4 and E4-6,
respectively. The recombination break points surrounding
the RPS2 locus are located within the 4.5 and 7.5 kb
EcoRI restriction endonuclease fragments.

Fig. 1F is a diagram showing the approximate
location of genes which encode the RNA transcripts which
have been identified by polyA+ RNA blot analysis. The
sizes of the transcripts are given in kilobase pairs
below each transcript.

Fig. 2 is the complete nucleotide sequence of
cDNA-4 comprising the RPS2 [SEQ. ID NO: 1] gene locus.
The three reading frames are shown below the nucleotide
sequence. The deduced amino acid sequence [SEQ. ID NOS: 2-5]
of reading frame "a" is provided and contains 909
amino acids. The methionine encoded by the ATG start
codon is circled in open reading frame "a" of Fig. 2.
The A of the ATG start codon is nucleotide 31 of Fig. 2.
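
The coordinates given for reading frame "a" imply a coding
span that can be worked out arithmetically. The Python sketch
below does so under two assumptions that the text does not
state: that the 909 residues are encoded contiguously starting
at nucleotide 31 of the cDNA, and that a stop codon follows
the last sense codon immediately. The function name
coding_span is illustrative only.

```python
START_NT = 31         # position of the A of the ATG start codon (from the text)
N_AMINO_ACIDS = 909   # length of the deduced polypeptide (from the text)


def coding_span(start_nt: int, n_amino_acids: int) -> tuple[int, int]:
    """Return 1-based (first, last) nucleotide positions of the sense codons."""
    last_nt = start_nt + 3 * n_amino_acids - 1
    return start_nt, last_nt


first, last = coding_span(START_NT, N_AMINO_ACIDS)
# Assumption: the stop codon occupies the three nucleotides after the last codon.
stop_codon = (last + 1, last + 3)

print(f"sense codons: nucleotides {first}-{last}")
print(f"assumed stop codon: nucleotides {stop_codon[0]}-{stop_codon[1]}")
```

Under these assumptions the 909 codons occupy 2727
nucleotides; the actual positions should be read from Fig. 2.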

Fig. 3 is the nucleotide sequence of the avrRpt2
gene [SEQ. ID NO: 105] and its deduced amino acid sequence
[SEQ. ID NO: 106]. A potential ribosome binding site is
underlined. An inverted repeat is indicated by
horizontal arrows at the 3' end of the open reading
frame. The deduced amino acid sequence is provided below
the nucleotide sequence of the open reading frame.

Fig. 4 is a schematic summary of the
complementation analysis that allowed functional
confirmation that the DNA carried on p4104 and p4115
(encoding cDNA-4) confers RPS2 disease resistance
activity to Arabidopsis thaliana plants previously
lacking RPS2 disease resistance activity. Small vertical
marks along the "genome" line represent restriction
enzyme EcoRI recognition sites, and the numbers above
this line represent the size, in kilobasepairs (kb), of
the resulting DNA fragments (see also Fig. 1E). Opposite
"cDNAs" are the approximate locations of the coding
sequences for RNA transcripts (see also Fig. 1F);
arrowheads indicate the direction of transcription for
cDNAs 4, 5, and 6. For functional complementation
experiments, rps2-201C/rps2-201C plants were genetically
transformed with the Arabidopsis thaliana genomic DNA
sequences indicated; these sequences were carried on the
named plasmids (derivatives of the binary cosmid vector
pSLJ4541) and delivered to the plant via
Agrobacterium-mediated transformation methods. The disease resistance
phenotype of the resulting transformants following
inoculation with P. syringae expressing avrRpt2 is given
as "Sus." (susceptible, no resistance response) or "Res."
(disease resistant).

The Genetic Basis for Resistance to Pathogens

An overview of the interaction between a plant
host and a microbial pathogen is presented. The invasion
of a plant by a potential pathogen can have a range of
outcomes delineated by the following extremes: either the
pathogen successfully proliferates in the host, causing
associated disease symptoms, or its growth is halted by
the host defenses. In some plant-pathogen interactions,
the visible hallmark of an active defense response is the
so-called hypersensitive response or "HR". The HR
involves rapid necrosis of cells near the site of the
infection and may include the formation of a visible dry
brown lesion. Pathogens which elicit an HR on a given
host are said to be avirulent on that host, the host is
said to be resistant, and the plant-pathogen interaction
is said to be incompatible. Strains which proliferate
and cause disease on a particular host are said to be
virulent; in this case the host is said to be
susceptible, and the plant-pathogen interaction is said
to be compatible.

"Classical" genetic analysis has been used
successfully to help elucidate the genetic basis of
plant-pathogen recognition for those cases in which a
series of strains (races) of a particular fungal or
bacterial pathogen are either virulent or avirulent on a
series of cultivars (or different wild accessions) of a
particular host species. In many such cases, genetic
analysis of both the host and the pathogen revealed that
many avirulent fungal and bacterial strains differ from
virulent ones by the possession of one or more avirulence
(avr) genes that have corresponding "resistance" genes in
the host. This avirulence gene-resistance gene
correspondence is termed the "gene-for-gene" model
(Crute, et al., (1985) pp 197-309 in: Mechanisms of
Resistance to Plant Disease. R.S.S. Fraser, ed.;
Ellingboe, (1981) Annu. Rev. Phytopathol. 19:125-143;
Flor, (1971) Annu. Rev. Phytopathol. 9:275-296; Keen and
Staskawicz, (1988) supra; and Keen et al. in: Application
of Biotechnology to Plant Pathogen Control. I. Chet, ed.,
John Wiley & Sons, 1993, pp. 65-88). According to a
simple formulation of this model, plant resistance genes
encode specific receptors for molecular signals generated
by avr genes. Signal transduction pathway(s) then carry
the signal to a set of target genes that initiate the HR
and other host defenses (Gabriel and Rolfe, (1990) Annu.
Rev. Phytopathol. 28:365-391). Despite this simple
predictive model, the molecular basis of the
avr-resistance gene interaction is still unknown.

One basic prediction of the gene-for-gene
hypothesis has been convincingly confirmed at the
molecular level by the cloning of a variety of bacterial
avr genes (Innes, et al., (1993) J. Bacteriol. 175:4859-4869;
Dong, et al., (1991) Plant Cell 3:61-72; Whalen et
al., (1991) Plant Cell 3:49-59; Staskawicz et al., (1987)
J. Bacteriol. 169:5789-5794; Gabriel et al., (1986)
P.N.A.S., USA 83:6415-6419; Keen and Staskawicz, (1988)
Annu. Rev. Microbiol. 42:421-440; Kobayashi et al.,
(1990) Mol. Plant-Microbe Interact. 3:94-102 and (1990)
Mol. Plant-Microbe Interact. 3:103-111). Many of these
cloned avirulence genes have been shown to correspond to
individual resistance genes in the cognate host plants
and have been shown to confer an avirulent phenotype when
transferred to an otherwise virulent strain. The avrRpt2
locus was isolated from Pseudomonas syringae pv. tomato
and sequenced by Innes et al. (Innes, R. et al. (1993) J.
Bacteriol. 175:4859-4869). Fig. 3 is the nucleotide
sequence [SEQ. ID NO: 105] and deduced amino acid sequence
[SEQ. ID NO: 106] of the avrRpt2 gene.

Examples of known signals to which plants respond
when infected by pathogens include harpins from Erwinia
(Wei et al. (1992) Science 257:85-88) and Pseudomonas (He
et al. (1993) Cell 73:1255-1266); avr4 (Joosten et al.
(1994) Nature 367:384-386) and avr9 peptides (van den
Ackerveken et al. (1992) Plant J. 2:359-366) from
Cladosporium; PopA1 from Pseudomonas (Arlat et al. (1994)
EMBO J. 13:543-553); avrD-generated lipopolysaccharide
(Midland et al. (1993) J. Org. Chem. 58:2940-2945); and
NIP1 from Rhynchosporium (Hahn et al. (1993) Mol.
Plant-Microbe Interact. 6:745-754).

Compared to avr genes, considerably less is known
about plant resistance genes that correspond to specific
avr-generated signals. The plant resistance gene, RPS2
(rps for resistance to Pseudomonas syringae), the first
gene of a new, previously unidentified class of plant
disease resistance genes, corresponds to a specific avr
gene (avrRpt2). Some of the work leading up to the
cloning of RPS2 is described in Yu, et al., (1993),
Molecular Plant-Microbe Interactions 6:434-443 and in
Kunkel, et al., (1993) Plant Cell 5:865-875.

A plant disease resistance gene, Pto, which
corresponds specifically to an apparently unrelated
avirulence gene, has been isolated from tomato (Lycopersicon
esculentum) (Martin et al., (1993) Science 262:1432-1436).
Tomato plants expressing the Pto gene are resistant to
infection by strains of Pseudomonas syringae pv. tomato
that express the avrPto avirulence gene. The amino acid
sequence inferred from the Pto gene DNA sequence displays
strong similarity to serine-threonine protein kinases,
implicating Pto in signal transduction. No similarity to
the tomato Pto locus or any known protein kinases was
observed for RPS2, suggesting that RPS2 is representative
of a new class of plant disease resistance genes.

The isolation of a race-specific resistance gene
from Zea mays (corn) known as Hm1 has been reported
(Johal and Briggs (1992) Science 258:985-987). Hm1
confers resistance against specific races of the fungal
pathogen Cochliobolus carbonum by controlling degradation
of a fungal toxin, a strategy that is mechanistically
distinct from the avirulence-gene specific resistance of
the RPS2-avrRpt2 resistance mechanism.

The cloned RPS2 gene of the invention can be used
to facilitate the construction of plants that are
resistant to specific pathogens and to overcome the
inability to transfer disease resistance genes between
species using classical breeding techniques (Keen et al.,
(1993), supra). There now follows a description of the
cloning and characterization of an Arabidopsis thaliana
RPS2 genetic locus, the RPS2 genomic DNA, and the RPS2
cDNA. The avrRpt2 gene and the RPS2 gene, as well as
mutants rps2-101C, rps2-102C, and rps2-201C (also
designated rps2-201), are described in Dong, et al.,
(1991) Plant Cell 3:61-72; Yu, et al., (1993) supra;
Kunkel et al., (1993) supra; Whalen et al., (1991),
supra; and Innes et al., (1993), supra. A mutant
designated rps2-101N has also been isolated. The
identification and cloning of the RPS2 gene is described
below.

RPS2 Overcomes Sensitivity to Pathogens Carrying the
avrRpt2 Gene.

To demonstrate the genetic relationship between an
avirulence gene in the pathogen and a resistance gene in
the host, it was necessary first to isolate an avirulence
gene. By screening Pseudomonas strains that are known
pathogens of crop plants related to Arabidopsis, highly
virulent strains, P. syringae pv. maculicola (Psm)
ES4326, P. syringae pv. tomato (Pst) DC3000, and an
avirulent strain, Pst MM1065, were identified and analyzed
as to their respective abilities to grow in wild type
Arabidopsis thaliana plants (Dong et al., (1991) Plant
Cell, 3:61-72; Whalen et al., (1991) Plant Cell 3:49-59;
MM1065 is designated JL1065 in Whalen et al.). Psm
ES4326 or Pst DC3000 can multiply 10^ fold in Arabidopsis
thaliana leaves and cause water-soaked lesions that
appear over the course of two days. Pst MM1065
multiplies a maximum of 10 fold in Arabidopsis thaliana
leaves and causes the appearance of a mildly chlorotic
dry lesion after 48 hours. Thus, disease resistance is
associated with severely inhibited growth of the
pathogen.

An avirulence gene (avr) of the Pst MM1065 strain
was cloned using standard techniques as described in Dong
et al. (1991), Plant Cell 3:61-72; Whalen et al., (1991)
supra; and Innes et al., (1993), supra. The isolated
avirulence gene from this strain was designated avrRpt2.
Normally, the virulent strain Psm ES4326 or Pst DC3000
causes the appearance of disease symptoms after 48 hours
as described above. In contrast, Psm ES4326/avrRpt2 or
Pst DC3000/avrRpt2 elicits the appearance of a visible
necrotic hypersensitivity response (HR) within 16 hours
and multiplies 50 fold less than Psm ES4326 or Pst DC3000
in wild type Arabidopsis thaliana leaves (Dong et al.,
(1991), supra; and Whalen et al., (1991), supra). Thus,
disease resistance in a wild type Arabidopsis plant
requires, in part, an avirulence gene in the pathogen or
a signal generated by the avirulence gene.

The isolation of four Arabidopsis thaliana disease
resistance mutants, using the cloned avrRpt2 gene to
search for the host gene required for
disease resistance to pathogens carrying the avrRpt2 gene,
has been described
(Yu et al., (1993), supra; Kunkel et al., (1993), supra).
The four Arabidopsis thaliana mutants failed to develop
an HR when infiltrated with Psm ES4326/avrRpt2 or Pst
DC3000/avrRpt2, as expected for plants having lost their
disease resistance capacity. In the case of one of these
mutants, approximately 3000 five to six week old M2
ecotype Columbia (Col-0) plants generated by ethyl
methanesulfonic acid (EMS) mutagenesis were
hand-inoculated with Psm ES4326/avrRpt2 and a single mutant,
rps2-101C (rps for resistance to Pseudomonas
syringae), was identified (Yu et al., (1993), supra).

The second mutant was isolated using a procedure
that specifically enriches for mutants unable to mount an
HR (Yu et al., (1993), supra). When 10-day old
Arabidopsis thaliana seedlings growing on petri plates
are infiltrated with Pseudomonas syringae pv.
phaseolicola (Psp) NPS3121 versus Psp NPS3121/avrRpt2,
about 90% of the plants infiltrated with Psp NPS3121
survive, whereas about 90%-95% of the plants infiltrated
with Psp NPS3121/avrRpt2 die. Apparently, vacuum
infiltration of an entire small Arabidopsis thaliana
seedling with Psp NPS3121/avrRpt2 elicits a systemic HR
which usually kills the seedling. In contrast, seedlings
infiltrated with Psp NPS3121 survive because Psp NPS3121
is a weak pathogen on Arabidopsis thaliana. The second
disease resistance mutant was isolated by infiltrating
4000 EMS-mutagenized Columbia M2 seedlings with Psp
NPS3121/avrRpt2. Two hundred survivors were obtained.
These were transplanted to soil and re-screened by hand
inoculation when the plants reached maturity. Of these
200 survivors, one plant failed to give an HR when
hand-infiltrated with Psm ES4326/avrRpt2. This mutant was
designated rps2-102C (Yu et al., (1993), supra).
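
The figures reported for this enrichment screen can be
checked with simple arithmetic. The Python sketch below uses
only the counts given above (4000 seedlings screened, 200
survivors, one confirmed mutant); the function and variable
names are illustrative and not part of the described
procedure.

```python
def survival_fraction(survivors: int, screened: int) -> float:
    """Fraction of screened seedlings that survived infiltration."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return survivors / screened


SCREENED = 4000          # EMS-mutagenized M2 seedlings infiltrated
SURVIVORS = 200          # seedlings that survived the first screen
CONFIRMED_MUTANTS = 1    # survivors that failed to give an HR on re-screening

survival = survival_fraction(SURVIVORS, SCREENED)
mortality = 1.0 - survival
confirmed_among_survivors = CONFIRMED_MUTANTS / SURVIVORS

print(f"survival: {survival:.1%}, mortality: {mortality:.1%}")
print(f"confirmed mutants among survivors: {confirmed_among_survivors:.1%}")
```

Two hundred survivors out of 4000 is a 5% survival rate,
which is consistent with the 90%-95% mortality stated for
seedlings infiltrated with Psp NPS3121/avrRpt2, and it shows
why the survivors had to be re-screened individually.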

A third mutant, rps2-201C, was isolated in a
screen of approximately 7500 M2 plants derived from seed
of Arabidopsis thaliana ecotype Col-0 that had been
mutagenized with diepoxybutane (Kunkel et al., (1993),
supra). Plants were inoculated by dipping entire leaf
rosettes into a solution containing Pst DC3000/avrRpt2
bacteria and the surfactant Silwet L-77 (Whalen et al.,
(1991), supra), incubating plants in a controlled
environment growth chamber for three to four days, and
then visually observing disease symptom development.
This screen revealed four mutant lines (carrying the
rps2-201C, rps2-202C, rps2-203C, and rps2-204C alleles),
and plants homozygous for rps2-201C were a primary
subject for further study (Kunkel et al., (1993), supra
and the instant application).




Isolation of the fourth rps2 mutant, rps2-101N,
has not yet been published. This fourth isolate is
either a mutant or a susceptible Arabidopsis ecotype.
Seeds of the Arabidopsis Nossen ecotype were
gamma-irradiated and then sown densely in flats and allowed to
germinate and grow through a nylon mesh. When the plants
were five to six weeks old, the flats were inverted, the
plants were partially submerged in a tray containing a
culture of Psm ES4326/avrRpt2, and the plants were vacuum
infiltrated in a vacuum desiccator. Plants inoculated
this way develop an HR within 24 hours. Using this
procedure, approximately 40,000 plants were screened and
one susceptible plant was identified. Subsequent RFLP
analysis of this plant suggested that it may not be a
Nossen mutant but rather a different Arabidopsis ecotype
that is susceptible to Psm ES4326/avrRpt2. This plant is
referred to as rps2-101N. The isolated mutants
rps2-101C, rps2-102C, rps2-201C, and rps2-101N are referred to
collectively as the "rps2 mutants".

The rps2 Mutants Fail to Specifically Respond to the
Cloned Avirulence Gene, avrRpt2.

The RPS2 gene product is specifically required for
resistance to pathogens carrying the avirulence gene,
avrRpt2. A mutation in Rps2 polypeptide that eliminates
or reduces its function would be observable as the
absence of a hypersensitive response upon infiltration of
the pathogen. The rps2 mutants displayed disease
symptoms or a null response when infiltrated with Psm
ES4326/avrRpt2, Pst DC3000/avrRpt2 or Psp
NPS3121/avrRpt2, respectively. Specifically, no HR
response was elicited, indicating that the plants were
susceptible and had lost resistance to the pathogen
despite the presence of the avrRpt2 gene in the pathogen.

Pathogen growth in rps2 mutant plant leaves was
similar in the presence and absence of the avrRpt2 gene.
The growth of Psm ES4326 and Psm ES4326/avrRpt2 in rps2
mutants was compared, and both strains were found to
multiply equally well in the
rps2 mutants, at the same rate that Psm ES4326 multiplied
in wild-type Arabidopsis leaves. Similar results were
observed for Pst DC3000 and Pst DC3000/avrRpt2 growth in
rps2 mutants.

The rps2 mutants displayed an HR when infiltrated
with Pseudomonas pathogens carrying other avr genes: Psm
ES4326/avrB, Pst DC3000/avrB, Psm ES4326/avrRpm1, and Pst
DC3000/avrRpm1. The ability to mount an HR to an avr
gene other than avrRpt2 indicates that the rps2 mutants
isolated by selection with avrRpt2 are specific to
avrRpt2.

Mapping and Cloning of the RPS2 Gene.

Genetic analysis of rps2 mutants rps2-101C,
rps2-102C, rps2-201C and rps2-101N showed that they all
corresponded to genes that segregated as expected for a
single Mendelian locus and that all four were most likely
allelic. The four rps2 mutants were mapped to the bottom
of chromosome IV using standard RFLP mapping procedures
including polymerase chain reaction (PCR)-based markers
(Yu et al., (1993), supra; Kunkel et al., (1993), supra;
and Mindrinos, M., unpublished). Segregation analysis
showed that rps2-101C and rps2-102C are tightly linked to
the PCR marker, PG11, while the RFLP marker M600 was used
to define the chromosome location of the rps2-201C
mutation (Fig. 1A) (Yu et al., (1993), supra; Kunkel et
al., (1993), supra). RPS2 has subsequently been mapped
to the centromeric side of PG11.

Heterozygous RPS2/rps2 plants display a defense
response that is intermediate between those displayed by
the wild-type and homozygous rps2/rps2 mutant plants (Yu
et al., (1993), supra; and Kunkel et al., (1993), supra).
The heterozygous plants mounted an HR in response to Psm
ES4326/avrRpt2 or Pst DC3000/avrRpt2 infiltration;
however, the HR appeared later than in wild-type plants
and required a higher minimum inoculum (Yu et al.,
(1993), supra; and Kunkel et al., (1993), supra).

High Resolution Mapping of the RPS2 Gene and RPS2 cDNA
Isolation.

To carry out map-based cloning of the RPS2 gene,
rps2-101N/rps2-101N was crossed with Landsberg erecta
RPS2/RPS2. Plants of the F1 generation were allowed to
self pollinate (to "self") and 165 F2 plants were selfed
to generate F3 families. Standard RFLP mapping
procedures showed that rps2-101N maps close to and on the
centromeric side of the RFLP marker, PG11. To obtain a
more detailed map position, rps2-101N/rps2-101N was
crossed with a doubly marked Landsberg erecta strain
containing the recessive mutations, cer2 and ap2. The
genetic distance between cer2 and ap2 is approximately 15
cM, and the rps2 locus is located within this interval.
F2 plants that displayed either a CER2 ap2 or a cer2 AP2
genotype were collected, selfed, and scored for RPS2 by
inoculating at least 20 F3 plants for each F2 with Psm
ES4326/avrRpt2. DNA was also prepared from a pool of
approximately 20 F3 plants for each F2 line. The CER2 ap2
and cer2 AP2 recombinants were used to carry out a
chromosome walk that is illustrated in Figure 1.

As shown in Figure 1, RPS2 was mapped to a 28-35
kb region spanned by cosmid clones E4-4 and E4-6. This
region contains at least six genes that produce
detectable transcripts. There were no significant
differences in the sizes of the transcripts or their
level of expression in the rps2 mutants as determined by
RNA blot analysis. cDNA clones of each of these
transcripts were isolated and five of these were
sequenced. As is described below, one of these
transcripts, cDNA-4, was shown to correspond to the RPS2
locus. From this study, three independent cDNA clones
(cDNA-4-4, cDNA-4-5, and cDNA-4-11) were obtained
corresponding to RPS2 from Columbia ecotype wild-type
plants. The apparent sizes of RPS2 transcripts were 3.8
and 3.1 kb as determined by RNA blot analysis.

A fourth independent cDNA-4 clone (cDNA-4-2453)
was obtained using map-based isolation of RPS2 in a
separate study. Yeast artificial chromosome (YAC) clones
were identified that carry contiguous, overlapping
inserts of Arabidopsis thaliana ecotype Col-0 genomic DNA
from the M600 region spanning approximately 900 kb in the
RPS2 region. Arabidopsis YAC libraries were obtained
from J. Ecker and E. Ward, supra, and from E. Grill (Grill
and Somerville (1991) Mol. Gen. Genet. 226:484-490).
Cosmids designated "H" and "E" were derived from the YAC
inserts and were used in the isolation of RPS2 (Fig. 1).

The genetic and physical location of RPS2 was more
precisely defined using physically mapped RFLP, RAPD
(random amplified polymorphic DNA) and CAPS (cleaved
amplified polymorphic sequence) markers. Segregating
populations from crosses between plants of genotype
RPS2/RPS2 (No-0 wild type) and rps2-201/rps2-201 (Col-0
background) were used for genetic mapping. The RPS2
locus was mapped using markers 17B7LE, PG11, M600 and
other markers. For high-resolution genetic mapping, a
set of tightly linked RFLP markers was generated using
insert end fragments from YAC and cosmid clones (Fig. 1)
(Kunkel et al. (1993), supra; Konieczny and Ausubel
(1993) Plant J. 4:403-410; and Chang et al. (1988) PNAS
USA 85:6856-6860). Cosmid clones E4-4 and E4-6 were then
used to identify expressed transcripts (designated cDNA-4,
-5, -6, -7, -8 of Fig. 1F) from this region, including
the cDNA-4-2453 clone.

RPS2 DNA Sequence Analysis.

DNA sequence analysis of cDNA-4 from wild-type
Col-0 plants and from mutants rps2-101C, rps2-102C,
rps2-201C and rps2-101N showed that cDNA-4 corresponds to
RPS2. DNA sequence analysis of rps2-101C, rps2-102C and
rps2-201C revealed changes from the wild-type sequence as
shown in Table 1. The numbering system in Table 1 starts
at the ATG start codon encoding the first methionine,
where A is nucleotide 1. DNA sequence analysis of cDNA-4
corresponding to mutant rps2-102C showed that it differed
from the wild-type sequence at amino acid residue 476.
Moreover, DNA sequence analysis of the cDNA corresponding
to cDNA-4 from rps2-101N showed that it contained a 10 bp
insertion at amino acid residue 581, a site within the
leucine-rich repeat region, which causes a shift in the
RPS2 reading frame. Mutant rps2-101C contains a mutation
that leads to the formation of a chain termination codon.
The DNA sequence of mutant allele rps2-201C revealed a
mutation altering a single amino acid within a segment of
the LRR region that also has similarity to the
helix-loop-helix motif, further supporting the designation of
this locus as the RPS2 gene. The DNA and amino acid
sequences are shown in Figure 2 [SEQ. ID NO:1 and SEQ ID
NOS:2-5, respectively].

Table 1.

Mutant      Wild type                          Position of   Change
                                               mutation
rps2-101C   703 TGA 705                        704           TAA (stop codon)
rps2-101N   1741 GTG 1743                      1741          GTGGAGTTGTATG (insertion)
rps2-102C   1426 AGA 1428 (arg, residue 476)   1427          AAA (amino acid lys)
rps2-201C   2002 ACC 2004 (thr)                2002          CCC (amino acid pro)
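
The nucleotide positions in Table 1 can be related to the amino acid residue numbers quoted in the text with simple arithmetic, since numbering starts at the A of the ATG start codon. The following is a minimal sketch of that conversion; the function names are illustrative and are not part of the original disclosure.

```python
def codon_number(nt_position: int) -> int:
    """1-based codon (residue) index for a 1-based nucleotide position,
    counting from the A of the ATG start codon."""
    return (nt_position - 1) // 3 + 1


def position_in_codon(nt_position: int) -> int:
    """Which base of its codon a nucleotide is: 1, 2 or 3."""
    return (nt_position - 1) % 3 + 1


# Positions of the mutations as listed in Table 1.
table1_positions = {
    "rps2-101C": 704,
    "rps2-101N": 1741,
    "rps2-102C": 1427,
    "rps2-201C": 2002,
}

for allele, pos in table1_positions.items():
    print(allele, "codon", codon_number(pos), "base", position_in_codon(pos))

# The rps2-101N entry replaces GTG with GTGGAGTTGTATG, i.e. 10 extra bases.
# An insertion whose length is not a multiple of 3 shifts the reading frame.
inserted_bases = len("GTGGAGTTGTATG") - len("GTG")
causes_frameshift = inserted_bases % 3 != 0
```

Under this numbering, position 1427 falls in codon 476 and position 1741 in codon 581, which agrees with the residue numbers given in the text for rps2-102C and rps2-101N.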



DNA sequence analysis of cDNA-4 corresponding to
RPS2 from wild-type Col-0 plants revealed an open reading
frame (between two stop codons) spanning 2,751 bp. There
are 2,727 bp between the first methionine codon of this
reading frame and the 3'-stop codon, which corresponds to
a deduced 909 amino acid polypeptide (see open reading
frame "a" of Fig. 2). The amino acid sequence has a
relative molecular weight of 104,460 and a pI of 6.51.
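
The relationship between the 2,727 bp coding stretch and the 909 residue polypeptide is a division by three. The sketch below checks that arithmetic and shows, on a made-up toy sequence, how an ATG-to-stop open reading frame can be located. It assumes that the 2,727 bp count runs from the A of the ATG up to, but not including, the stop codon; the helper `first_orf` is hypothetical and is not the method used in the original analysis.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(seq: str, frame: int = 0):
    """Return (start, end) of the first ATG..stop ORF in the given frame.

    Coordinates are 0-based; `end` is the index of the first base of the
    stop codon, so seq[start:end] is the coding stretch without the stop.
    Returns None if no complete ORF is found.
    """
    seq = seq.upper()
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            return start, i
    return None


# Arithmetic from the text: 2,727 coding bp correspond to 909 residues.
coding_bp = 2727
residues = coding_bp // 3
remainder = coding_bp % 3

# Toy sequence (not RPS2): CCC ATG GCT GGT TAA GG
toy = "CCCATGGCTGGTTAAGG"
orf = first_orf(toy)
```

For the toy sequence the coding stretch is `ATGGCTGGT`, nine bases encoding three residues.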

RPS2 belongs to a new class of disease resistance
genes; the structure of the Rps2 polypeptide does not
resemble the protein structure of the product of the only
previously cloned and publicized avirulence gene-specific
plant disease resistance gene, Pto, which has a putative
protein kinase domain. From the above analysis of the
deduced amino acid sequence, RPS2 contains several
distinct protein domains conserved in other proteins from
both eukaryotes and prokaryotes. These domains include
but are not limited to Leucine Rich Repeats (LRR) (Kobe
and Deisenhofer, (1994) Nature 366:751-756); P-loop
(Saraste et al. (1990) Trends in Biological Sciences TIBS
15:430-434); Helix-Loop-Helix (Murre et al. (1989) Cell
56:777-783); and Leucine Zipper (Rodrigues and Park (1993)
Mol. Cell Biol. 13:6711-6722). The amino acid sequence
of Rps2 contains an LRR motif (from amino acid
residue 505 to amino acid residue 867), which is present
in many known proteins and which is thought to be
involved in protein-protein interactions and may thus
allow interaction with other proteins that are involved
in plant disease resistance. The N-terminal portion of
the Rps2 polypeptide LRR is, for example, related to the
LRR of yeast (Saccharomyces cerevisiae) adenylate
cyclase, CYR1. A region predicted to be a transmembrane
spanning domain (Klein et al. (1985) Biochim. Biophys.
Acta 815:468-476) is located from amino acid residue 350
to amino acid residue 365, N-terminal to the LRR. An
ATP/GTP binding site motif (P-loop) is predicted to be
located between amino acid residue 177 and amino acid
residue 194, inclusive.
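
A P-loop of the kind mentioned above is commonly recognized by a short consensus pattern. The sketch below scans a peptide for the widely used consensus [AG]-x(4)-G-K-[ST]; the choice of this particular pattern is an assumption, since the text does not state how the motif was predicted, and the peptide shown is a made-up example rather than the Rps2 sequence.

```python
import re

# Widely used P-loop (ATP/GTP binding site motif A) consensus:
# [AG]-x(4)-G-K-[ST]. Using it here is an assumption for illustration.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein: str):
    """Return (start, end, motif) tuples with 1-based inclusive coordinates."""
    hits = []
    for match in P_LOOP.finditer(protein.upper()):
        hits.append((match.start() + 1, match.end(), match.group()))
    return hits


toy_peptide = "MDFGPGGVGKTTLMQ"  # hypothetical peptide
hits = find_p_loops(toy_peptide)
print(hits)
```

In the toy peptide the pattern matches residues 4 through 11. Applied to a full-length deduced sequence, the same scan would report candidate sites that could then be compared with the residue range quoted in the text.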

From the above analysis of the deduced amino acid
sequence, the Rps2 polypeptide may have a
membrane-receptor structure which consists of an N-terminal
extracellular region and a C-terminal cytoplasmic region.
Alternatively, the topology of Rps2 may be the
opposite: an N-terminal cytoplasmic region and a
C-terminal extracellular region. LRR motifs are
extracellular in many cases and the Rps2 LRR contains
five potential N-glycosylation sites.
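
Potential N-glycosylation sites are usually counted by searching for the sequon N-x-S/T, where x is any residue except proline. The following sketch illustrates such a count on a made-up peptide; the sequon rule is a standard convention assumed here, as the text does not say how the five sites in the Rps2 LRR were identified.

```python
import re

# Sequon N-x-[ST] with x != P. A lookahead is used so that overlapping
# candidate sites are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def n_glycosylation_sites(protein: str):
    """Return the 1-based positions of the Asn in each N-x-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


toy_peptide = "MNGTANPSLNVSK"  # hypothetical peptide, not Rps2
sites = n_glycosylation_sites(toy_peptide)
print(sites)
```

In the toy peptide the asparagines at positions 2 and 10 are reported, while the one at position 6 is skipped because it is followed by proline.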

Identification of RPS2 by Functional Complementation.

Complementation of rps2-201 homozygotes with
genomic DNA corresponding to Arabidopsis thaliana
functionally confirmed that the genomic region encoding
cDNA-4 carries RPS2 activity. Cosmids were constructed
that contained overlapping contiguous sequences of
wild-type Arabidopsis thaliana DNA from the RPS2 region
contained in YACs EW11D4, EW9C3, and YUP11F1 of Fig. 1
and Fig. 4. The cosmid vectors were constructed from
pSLJ4541 (obtained from J. Jones, Sainsbury Institute,
Norwich, England), which contains sequences that allow the
inserted sequence to be integrated into the plant genome
via Agrobacterium-mediated transformation (designated
"binary cosmid"). "H" and "E" cosmids (Fig. 1) were used
to identify clones carrying DNA from the Arabidopsis
thaliana genomic RPS2 region.

More than forty binary cosmids containing inserted
RPS2 region DNA were used to transform rps2-201
homozygous mutants utilizing Agrobacterium-mediated
transformation (Chang et al. (1990) p. 28, Abstracts of
the Fourth International Conference on Arabidopsis
Research, Vienna, Austria). Transformants which remained
susceptible (determined by methods including the observed
absence of an HR following infection with P. syringae pv.
phaseolicola strain 3121 carrying avrRpt2 and Psp 3121
without avrRpt2) indicated that the inserted DNA did not
contain functional RPS2. These cosmids conferred the
"Sus." or susceptible phenotype indicated in Fig. 4.
Transformants which had acquired avrRpt2-specific disease
resistance (determined by methods including the display
of a strong hypersensitive response (HR) when inoculated
with Psp 3121 with avrRpt2, but not following inoculation
with Psp 3121 without avrRpt2) suggested that the
inserted DNA contained a functional RPS2 gene capable of
conferring the "Res." or resistant phenotype indicated in
Fig. 4. Transformants obtained using the pD4 binary
cosmid displayed a strong resistance phenotype as
described above. The presence of the insert DNA in the
transformants was confirmed by classical genetic analysis
(the tight genetic linkage of the disease resistance
phenotype and the kanamycin resistance phenotype
conferred by the cotransformed selectable marker) and
Southern analysis. These results indicated that RPS2 is
encoded by a segment of the 18 kb Arabidopsis thaliana
genomic region carried on cosmid pD4 (Fig. 4).

To further localize the RPS2 locus and confirm its
ability to confer a resistance phenotype on the rps2-201
homozygous mutants, a set of six binary cosmids
containing partially overlapping genomic DNA inserts was
tested. The overlapping inserts pD2, pD4, pD14, pD15,
pD27, and pD47 were chosen based on the location of the
transcription corresponding to the five cDNA clones in
the RPS2 region (Fig. 4). These transformation
experiments utilized a vacuum infiltration procedure
(Bechtold et al. (1993) C.R. Acad. Sci. Paris 316:1194-
1199) for Agrobacterium-mediated transformation.
Agrobacterium-mediated transformations with cosmids pD2,
pD14, pD15, pD39, and pD46 were performed using a root
transformation/regeneration protocol (Valvekens et al.
(1988), PNAS 85:5536-5540). The results of pathogen
inoculation experiments assaying for RPS2 activity in
these transformants are indicated in Fig. 4.
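
The scoring rule applied to transformants in the complementation tests can be written out explicitly. The sketch below encodes only what the text states: an HR with Psp 3121 carrying avrRpt2 but not with Psp 3121 lacking it is scored "Res.", and no HR with either strain is scored "Sus.". The "unscored" outcome for an HR in the absence of avrRpt2 is an assumption added for completeness and is not described in the source.

```python
def score_transformant(hr_with_avrRpt2: bool, hr_without_avrRpt2: bool) -> str:
    """Classify a transformant from its HR after two inoculations.

    hr_with_avrRpt2:    HR observed after Psp 3121 carrying avrRpt2
    hr_without_avrRpt2: HR observed after Psp 3121 lacking avrRpt2
    """
    if hr_with_avrRpt2 and not hr_without_avrRpt2:
        return "Res."  # avrRpt2-specific resistance: functional RPS2 present
    if not hr_with_avrRpt2 and not hr_without_avrRpt2:
        return "Sus."  # insert did not restore RPS2 function
    return "unscored"  # assumption: case not covered by the text


# Hypothetical observations for two cosmids, for illustration only.
observations = {
    "cosmid_A": (True, False),
    "cosmid_B": (False, False),
}
phenotypes = {name: score_transformant(*obs) for name, obs in observations.items()}
print(phenotypes)
```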

Additional transformation experiments utilized
binary cosmids carrying the complete coding region and
more than 1 kb of upstream genomic sequence for only
cDNA-4 or cDNA-6. Using the vacuum infiltration
transformation method, three independent transformants
were obtained that carried the wild-type cDNA-6 genomic
region in an rps2-201C homozygous background (pAD431 of
Fig. 4). None of these plants displayed
avrRpt2-dependent disease resistance. Homozygous rps2-201C
mutants were transformed with wild-type genomic cDNA-4
(p4104 and p4115, each carrying Col-0 genomic sequences
corresponding to all of the cDNA-4 open reading frame,
plus approximately 1.7 kb of 5' upstream sequence and
approximately 0.3 kb of 3' sequence downstream of the
stop codon). These p4104 and p4115 transformants
displayed a disease resistance phenotype similar to the
wild-type RPS2 homozygotes from which the rps2 mutants were
derived. Additional mutants (rps2-101N and rps2-101C
homozygotes) also displayed avrRpt2-dependent resistance
when transformed with the cDNA-4 genomic region.

RPS2 Sequences Allow Detection of Other Resistance Genes.

DNA blot analysis of Arabidopsis thaliana genomic
DNA using RPS2 cDNA as the probe showed that Arabidopsis
contains several DNA sequences that hybridize to RPS2 or
a portion thereof, suggesting that there are several
related genes in the Arabidopsis genome.

From the aforementioned description and the
nucleic acid sequence [SEQ. ID. NO:1] shown in Fig. 2, it
is possible to isolate other plant disease resistance
genes having about 50% or greater sequence identity to
the RPS2 gene. Detection and isolation can be carried
out with an oligonucleotide probe containing the RPS2
gene or a portion thereof greater than about 18 nucleic
acids in length. Probes to sequences encoding specific
structural features of the Rps2 polypeptide [SEQ. ID
NOS:2-5] are preferred as they provide a means of
isolating disease resistance genes having similar
structural domains. Hybridization can be done using
standard techniques such as are described in Ausubel et
al., Current Protocols in Molecular Biology, John Wiley &
Sons, (1989).

For example, high stringency conditions for
detecting the RPS2 gene include hybridization at about
42°C, and about 50% formamide; a first wash at about
65°C, about 2X SSC, and 1% SDS; followed by a second wash
at about 65°C and about 0.1% SSC. Lower stringency
conditions for detecting RPS genes having about 50%
sequence identity to the RPS2 gene include, for
example, hybridization at about 42°C in the absence of
formamide; a first wash at about 42°C, about 6X SSC, and
about 1% SDS; and a second wash at about 50°C, about 6X
SSC, and about 1% SDS. An approximately 350 nucleotide
DNA probe encoding the middle portion of the LRR region
of Rps2 was used as a probe in the above example. Under
lower stringency conditions, a minimum of 5 DNA bands
were detected in BamHI digested Arabidopsis thaliana
genomic DNA as sequences having sufficient sequence
identity to hybridize to DNA encoding the middle portion
of the LRR motif of Rps2. Similar results were obtained
using a probe containing a 300 nucleotide portion of the
RPS2 gene encoding the extreme N-terminus of Rps2 outside
of the LRR motif.

Isolation of other disease resistance genes is
performed by PCR amplification techniques well known to
those skilled in the art of molecular biology using
oligonucleotide primers designed to amplify only
sequences flanked by the oligonucleotides in genes having
sequence identity to RPS2. The primers are optionally
designed to allow cloning of the amplified product into a
suitable vector.
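
The idea that a primer pair amplifies only the sequence flanked by the two oligonucleotides can be illustrated with a small in-silico sketch. It assumes exact primer matches on a single-stranded template written 5' to 3'; the template and primers are made-up sequences, and the helper names are hypothetical. Real primer design would also consider mismatches, melting temperature and primer specificity, none of which are modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the product flanked by the two primers, or None.

    The forward primer must match the template strand exactly, and the
    reverse primer must match the opposite strand downstream of it.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Toy template and primers (not RPS2 sequence).
template = "TTGACCGTAAGCTTGGCATCCGATAA"
forward_primer = "GACCGT"
reverse_primer = "CGGATG"  # reverse complement of CATCCG on the template
product = amplicon(template, forward_primer, reverse_primer)
print(product)
```

A restriction site can be appended to the 5' end of each primer when the product is to be cloned into a vector, which corresponds to the optional primer design mentioned in the text.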

RPS2 Expression in Transgenic Plant Cells and Plants 
The expression of the RPS2 gene in plants
susceptible to pathogens carrying avrRpt2 is achieved by
introducing into a plant a DNA sequence containing the
RPS2 gene for expression of the Rps2 polypeptide. A
number of vectors suitable for stable transfection of
plant cells or for the establishment of transgenic plants
are available to the public; such vectors are described
in, e.g., Pouwels et al., Cloning Vectors: A Laboratory
Manual, 1985, Supp. 1987; Weissbach and Weissbach,
Methods for Plant Molecular Biology, Academic Press,
1989; and Gelvin et al., Plant Molecular Biology Manual,
Kluwer Academic Publishers, 1990. Typically, plant
expression vectors include (1) one or more cloned plant
genes under the transcriptional control of 5' and 3'
regulatory sequences and (2) a dominant selectable
marker. Such plant expression vectors may also contain,
if desired, a promoter regulatory region (e.g., a
regulatory region controlling inducible or constitutive,
environmentally- or developmentally-regulated, or cell-
or tissue-specific expression), a transcription
initiation start site, a ribosome binding site, an RNA
processing signal, a transcription termination site,
and/or a polyadenylation signal.

An example of a useful plant promoter which could
be used to express a plant resistance gene according to
the invention is a caulimovirus promoter, e.g., the
cauliflower mosaic virus (CaMV) 35S promoter. These
promoters confer high levels of expression in most plant
tissues, and the activity of these promoters is not
dependent on virally encoded proteins. CaMV is a source
for both the 35S and 19S promoters. In most tissues of
transgenic plants, the CaMV 35S promoter is a strong
promoter (see, e.g., Odell et al., Nature 313:810,
(1985)). The CaMV promoter is also highly active in
monocots (see, e.g., Dekeyser et al., Plant Cell 2:591,
(1990); Terada and Shimamoto, Mol. Gen. Genet. 220:389,
(1990)).

Other useful plant promoters include, without
limitation, the nopaline synthase promoter (An et al.,
Plant Physiol. 88:547, (1988)) and the octopine synthase
promoter (Fromm et al., Plant Cell 1:977, (1989)).

For certain applications, it may be desirable to
produce the RPS2 gene product or the avrRpt2 gene product
in an appropriate tissue, at an appropriate level, or at
an appropriate developmental time. Thus, there are a
variety of gene promoters, each with its own distinct
characteristics embodied in its regulatory sequences,
shown to be regulated in response to the environment,
hormones, and/or developmental cues. These include gene
promoters that are responsible for (1) heat-regulated
gene expression (see, e.g., Callis et al., Plant Physiol.
88:965, (1988)), (2) light-regulated gene expression
(e.g., the pea rbcS-3A described by Kuhlemeier et al.,
Plant Cell 1:471, (1989); the maize rbcS promoter
described by Schaffner and Sheen, Plant Cell 3:997,
(1991); or the chlorophyll a/b-binding protein gene found
in pea described by Simpson et al., EMBO J. 4:2723,
(1985)), (3) hormone-regulated gene expression (e.g., the
abscisic acid responsive sequences from the Em gene of
wheat described by Marcotte et al., Plant Cell 1:969,
(1989)), (4) wound-induced gene expression (e.g., of wun1
described by Siebertz et al., Plant Cell 1:961, (1989)),
or (5) organ-specific gene expression (e.g., of the
tuber-specific storage protein gene described by Roshal
et al., EMBO J. 6:1155, (1987); the 23-kDa zein gene from
maize described by Schernthaner et al., EMBO J. 7:1249,
(1988); or the French bean β-phaseolin gene described by
Bustos et al., Plant Cell 1:839, (1989)).

Plant expression vectors may also optionally
include RNA processing signals, e.g., introns, which have
been shown to be important for efficient RNA synthesis
and accumulation (Callis et al., Genes and Dev. 1:1183,
(1987)). The location of the RNA splice sequences can
influence the level of transgene expression in plants.
In view of this fact, an intron may be positioned
upstream or downstream of an Rps2 polypeptide-encoding
sequence in the transgene to modulate levels of gene
expression.

In addition to the aforementioned 5' regulatory
control sequences, the expression vectors may also
include regulatory control regions which are generally
present in the 3' regions of plant genes (Thornburg et
al., Proc. Natl. Acad. Sci. USA 84:744, (1987); An et
al., Plant Cell 1:115, (1989)). For example, the 3'
terminator region may be included in the expression
vector to increase stability of the mRNA. One such
terminator region may be derived from the PI-II
terminator region of potato. In addition, other commonly
used terminators are derived from the octopine or
nopaline synthase signals.

The plant expression vector also typically
contains a dominant selectable marker gene used to
identify the cells that have become transformed. Useful
selectable marker genes for plant systems include genes
encoding antibiotic resistance, for example, those
encoding resistance to hygromycin, kanamycin, bleomycin,
G418, streptomycin or spectinomycin. Genes required for
photosynthesis may also be used as selectable markers in
photosynthetic-deficient strains. Finally, genes
encoding herbicide resistance may be used as selectable
markers; useful herbicide resistance genes include the
bar gene encoding the enzyme phosphinothricin
acetyltransferase, which confers resistance to the broad
spectrum herbicide Basta® (Hoechst AG, Frankfurt,
Germany).

Efficient use of selectable markers is facilitated
by a determination of the susceptibility of a plant cell
to a particular selectable agent and a determination of
the concentration of this agent which effectively kills
most, if not all, of the transformed cells. Some useful
concentrations of antibiotics for tobacco transformation
include, e.g., 75-100 µg/ml (kanamycin), 20-50 µg/ml
(hygromycin), or 5-10 µg/ml (bleomycin). A useful
strategy for selection of transformants for herbicide
resistance is described, e.g., in Vasil I.K., Cell
Culture and Somatic Cell Genetics of Plants, Vol I, II,
III Laboratory Procedures and Their Applications, Academic
Press, New York, 1984.

It should be readily apparent to one skilled in
the field of plant molecular biology that the level of
gene expression is dependent not only on the combination
of promoters, RNA processing signals and terminator
elements, but also on how these elements are used to
increase the levels of gene expression.

Plant Transformation

Upon construction of the plant expression vector,
several standard methods are known for introduction of
the recombinant genetic material into the host plant for
the generation of a transgenic plant. These methods
include (1) Agrobacterium-mediated transformation (A.
tumefaciens or A. rhizogenes) (see, e.g., Lichtenstein
and Fuller, In: Genetic Engineering, vol 6, PWJ Rigby, ed,
London, Academic Press, 1987; and Lichtenstein, C.P., and
Draper, J., In: DNA Cloning, Vol II, D.M. Glover, ed,
Oxford, IRL Press, 1985), (2) the particle delivery
system (see, e.g., Gordon-Kamm et al., Plant Cell 2:603,
(1990); or BioRad Technical Bulletin 1687, supra), (3)
microinjection protocols (see, e.g., Green et al., Plant
Tissue and Cell Culture, Academic Press, New York, 1987),
(4) polyethylene glycol (PEG) procedures (see, e.g.,
Draper et al., Plant Cell Physiol. 23:451, (1982); or
e.g., Zhang and Wu, Theor. Appl. Genet. 76:835, (1988)),
(5) liposome-mediated DNA uptake (see, e.g., Freeman et
al., Plant Cell Physiol. 25:1353, (1984)), (6)
electroporation protocols (see, e.g., Gelvin et al., supra;
Dekeyser et al., supra; or Fromm et al., Nature 319:791,
(1986)), and (7) the vortexing method (see, e.g., Kindle,
K., Proc. Natl. Acad. Sci. USA 87:1228, (1990)).

The following is an example outlining an
Agrobacterium-mediated plant transformation. The general
process for manipulating genes to be transferred into the
genome of plant cells is carried out in two phases.
First, all the cloning and DNA modification steps are
done in E. coli, and the plasmid containing the gene
construct of interest is transferred by conjugation into
Agrobacterium. Second, the resulting Agrobacterium
strain is used to transform plant cells. Thus, for the
generalized plant expression vector, the plasmid contains
an origin of replication that allows it to replicate in
Agrobacterium and a high copy number origin of
replication functional in E. coli. This permits facile
production and testing of transgenes in E. coli prior to
transfer to Agrobacterium for subsequent introduction
into plants. Two resistance genes can be carried on the
vector: one for selection in bacteria (e.g.,
streptomycin resistance) and another that will be expressed
in plants (e.g., a gene encoding kanamycin resistance or a
herbicide resistance gene). Also present are restriction
endonuclease sites for the addition of one or more
transgenes operably linked to appropriate regulatory
sequences and directional T-DNA border sequences which,
when recognized by the transfer functions of
Agrobacterium, delimit the region that will be
transferred to the plant.

In another example, plant cells may be transformed
by shooting into the cell tungsten microprojectiles on
which cloned DNA is precipitated. In the Biolistic
Apparatus (Bio-Rad, Hercules, CA) used for the shooting,
a gunpowder charge (22 caliber Power Piston Tool Charge)
or an air-driven blast drives a plastic macroprojectile
through a gun barrel. An aliquot of a suspension of
tungsten particles on which DNA has been precipitated is
placed on the front of the plastic macroprojectile. The
latter is fired at an acrylic stopping plate that has a
hole through it that is too small for the macroprojectile
to go through. As a result, the plastic macroprojectile
smashes against the stopping plate and the tungsten
microprojectiles continue toward their target through the
hole in the plate. For the instant invention the target
can be any plant cell, tissue, seed, or embryo. The DNA
introduced into the cell on the microprojectiles becomes
integrated into either the nucleus or the chloroplast.

Transfer and expression of transgenes in plant
cells is now routine practice to those skilled in the
art. It has become a major tool to carry out gene
expression studies and to attempt to obtain improved
plant varieties of agricultural or commercial interest.

Transgenic Plant Regeneration 

Plant cells transformed with a plant expression
vector can be regenerated, e.g., from single cells,
callus tissue or leaf discs according to standard plant
tissue culture techniques. It is well known in the art
that various cells, tissues and organs from almost any
plant can be successfully cultured to regenerate an
entire plant; such techniques are described, e.g., in
Vasil, supra; Green et al., supra; Weissbach and
Weissbach, supra; and Gelvin et al., supra.

In one possible example, a vector carrying a
selectable marker gene (e.g., kanamycin resistance) and a
cloned RPS2 gene under the control of its own promoter
and terminator or, if desired, under the control of
exogenous regulatory sequences such as the 35S CaMV
promoter and the nopaline synthase terminator is
transformed into Agrobacterium. Transformation of leaf
tissue with vector-containing Agrobacterium is carried
out as described by Horsch et al. (Science 227:1229,
(1985)). Putative transformants are selected after a few
weeks (e.g., 3 to 5 weeks) on plant tissue culture media
containing kanamycin (e.g., 100 µg/ml).
Kanamycin-resistant shoots are then placed on plant tissue culture
media without hormones for root initiation.
Kanamycin-resistant plants are then selected for greenhouse growth.
If desired, seeds from self-fertilized transgenic plants
can then be sown in a soil-less medium and grown in a
greenhouse. Kanamycin-resistant progeny are selected by
sowing surface-sterilized seeds on hormone-free
kanamycin-containing media. Analysis for the integration
of the transgene is accomplished by standard techniques
(see, e.g., Ausubel et al., supra; Gelvin et al., supra).

Transgenic plants expressing the selectable marker
are then screened for transmission of the transgene DNA
by standard immunoblot and DNA and RNA detection
techniques. Each positive transgenic plant and its
transgenic progeny are unique in comparison to other
transgenic plants established with the same transgene.
Integration of the transgene DNA into the plant genomic
DNA is in most cases random and the site of integration
can profoundly affect the levels, and the tissue and
developmental patterns, of transgene expression.
Consequently, a number of transgenic lines are usually
screened for each transgene to identify and select plants
with the most appropriate expression profiles.

Transgenic lines are evaluated for levels of
transgene expression. Expression at the RNA level is
determined initially to identify and quantitate
expression-positive plants. Standard techniques for RNA
analysis are employed and include PCR amplification
assays using oligonucleotide primers designed to amplify
only transgene RNA templates and solution hybridization
assays using transgene-specific probes (see, e.g.,
Ausubel et al., supra). The RNA-positive plants are then
analyzed for protein expression by Western immunoblot
analysis using Rps2 polypeptide-specific antibodies (see,
e.g., Ausubel et al., supra). In addition, in situ
hybridization and immunocytochemistry according to
standard protocols can be done using transgene-specific
nucleotide probes and antibodies, respectively, to
localize sites of expression within transgenic tissue.

Once the Rps2 polypeptide has been expressed in
any cell or in a transgenic plant (e.g., as described
above), it can be isolated using any standard technique,
e.g., affinity chromatography. In one example, an
anti-Rps2 antibody (e.g., produced as described in Ausubel et
al., supra, or by any standard technique) may be attached
to a column and used to isolate the polypeptide. Lysis
and fractionation of Rps2-producing cells prior to
affinity chromatography may be performed by standard
methods (see, e.g., Ausubel et al., supra). Once
isolated, the recombinant polypeptide can, if desired, be
further purified, e.g., by high performance liquid
chromatography (see, e.g., Fisher, Laboratory Techniques
In Biochemistry And Molecular Biology, Work and Burdon,
eds., Elsevier, 1980).

These general techniques of polypeptide expression 
and purification can also be used to produce and isolate 
useful Rps2 fragments or analogs. 

Use 

Introduction of RPS2 into a transformed plant cell
provides for resistance to bacterial pathogens carrying
the avrRpt2 avirulence gene. For example, transgenic
plants of the instant invention expressing RPS2 might be
used to alter, simply and inexpensively, the disease
resistance of plants normally susceptible to plant
pathogens carrying the avirulence gene, avrRpt2.

The invention also provides for broad-spectrum
pathogen resistance by mimicking the natural mechanism of
host resistance. First, the RPS2 transgene is expressed
in plant cells at a sufficiently high level to initiate
the plant defense response constitutively in the absence
of signals from the pathogen. The level of expression
associated with plant defense response initiation is
determined by measuring the levels of defense response
gene expression as described in Dong et al., supra.
Second, the RPS2 transgene is expressed by a controllable
promoter such as a tissue-specific promoter, a cell-type
specific promoter, or a promoter that is induced by an
external signal or agent, thus limiting the temporal and
tissue expression of a defense response. Finally, the
RPS2 gene product is co-expressed with the avrRpt2 gene
product. The RPS2 gene is expressed by its natural
promoter, by a constitutively expressed promoter such as
the CaMV 35S promoter, by a tissue-specific or cell-type
specific promoter, or by a promoter that is activated by
an external signal or agent. Co-expression of RPS2 and
avrRpt2 will mimic the production of gene products
associated with the initiation of the plant defense
response and provide resistance to pathogens in the
absence of specific resistance gene-avirulence gene
corresponding pairs in the host plant and pathogen.

The invention also provides for expression in
plant cells of a nucleic acid having the sequence [SEQ.
ID. NO:1] of Fig. 2 or the expression of a degenerate
variant thereof encoding the amino acid sequence [SEQ. ID
NOS:2-5] of open reading frame "a" of Fig. 2.
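
By way of illustration only, the notion of a degenerate variant can be sketched computationally. The following Python sketch assumes the standard genetic code and uses nucleotides 31-48 of SEQ ID NO:1 (ATG GAT TTC ATC TCA TCT) as the native sequence; the variant sequence and the helper name `translate` are hypothetical and are not part of the disclosure.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in its first reading frame (one-letter codes)."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 31-48 of SEQ ID NO:1 (start of the Met-initiated reading frame).
native = "ATGGATTTCATCTCATCT"
# Hypothetical degenerate variant: synonymous codons chosen for illustration.
variant = "ATGGACTTTATATCGAGC"

print(translate(native))   # Met Asp Phe Ile Ser Ser in one-letter code
print(translate(variant))  # same peptide from a different nucleotide sequence
```

The two nucleotide sequences differ at several positions yet encode the same six residues, which is the sense in which the second is a degenerate variant of the first.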

The invention further provides for the isolation
of nucleic acid sequences having about 50% or greater
sequence identity to RPS2 by using the RPS2 sequence
[SEQ. ID. NO:1] of Fig. 2 or a portion thereof greater
than about 18 nucleic acids in length as a probe.
Appropriate reduced hybridization stringency conditions
are utilized to isolate DNA sequences having about 50% or
greater sequence identity to the RPS2 sequence [SEQ. ID.
NO:1] of Fig. 2.
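
The 50% sequence identity figure above can be illustrated with a simple calculation. The sketch below is a minimal, ungapped comparison written in Python; it is not a model of hybridization, which also depends on conditions such as temperature and salt concentration, and the function names and the candidate sequence are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_identity(probe: str, target: str) -> tuple[float, int]:
    """Slide the probe along the target; return (best identity, 0-based offset)."""
    probe, target = probe.upper(), target.upper()
    if len(probe) > len(target):
        raise ValueError("probe must not be longer than target")
    best = (-1.0, -1)
    for offset in range(len(target) - len(probe) + 1):
        identity = percent_identity(probe, target[offset:offset + len(probe)])
        if identity > best[0]:
            best = (identity, offset)
    return best


# Probe: nucleotides 31-50 of SEQ ID NO:1 (20 nt, i.e. longer than 18 nt).
probe = "ATGGATTTCATCTCATCTCT"
# Hypothetical candidate sequence, invented for illustration only.
candidate = "GGCATGGACTTTATATCGAGCCTAA"

identity, offset = best_ungapped_identity(probe, candidate)
print(f"best identity {identity:.0f}% at offset {offset}; "
      f"meets 50% threshold: {identity >= 50.0}")
```

A practical comparison would ordinarily use a gapped alignment; the ungapped window is used here only to keep the arithmetic transparent.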

The invention will provide disease resistance to
plants, especially crop plants, most especially important
crop plants such as tomato, pepper, maize, wheat, rice
and legumes such as soybean and bean, or any plant which
is susceptible to pathogens carrying an avirulence gene,
e.g., the avrRpt2 avirulence gene. Such pathogens
include, but are not limited to, Pseudomonas syringae
strains.

The invention also includes any biologically
active fragment or analog of an Rps2 polypeptide. By
"biologically active" is meant possessing any in vivo
activity which is characteristic of the Rps2 polypeptide
[SEQ. ID NOS:2-5] shown in Fig. 2. A useful Rps2
fragment or Rps2 analog is one which exhibits a
biological activity in any biological assay for disease
resistance gene product activity, for example, those
assays described by Dong et al. (1991), supra; Yu et al.
(1993), supra; Kunkel et al. (1993), supra; and Whalen
et al. (1991). In particular, a biologically active Rps2
polypeptide fragment or analog is capable of providing
substantial resistance to plant pathogens carrying the
avrRpt2 avirulence gene. By substantial resistance is
meant at least partial reduction in susceptibility to
plant pathogens carrying the avrRpt2 gene.

Preferred analogs include Rps2 polypeptides (or
biologically active fragments thereof) whose sequences
differ from the wild-type sequence only by conservative
amino acid substitutions, for example, substitution of
one amino acid for another with similar characteristics
(e.g., valine for glycine, arginine for lysine, etc.) or
by one or more non-conservative amino acid substitutions,
deletions, or insertions which do not abolish the
polypeptide's biological activity.
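
The distinction between conservative and non-conservative substitutions can be sketched as follows. The grouping of amino acids used below is an assumption made for illustration (it is one of several groupings in common use and is not defined in this specification); it was chosen so that the two examples given above, valine for glycine and arginine for lysine, fall within a single group. The function names are hypothetical.

```python
# Assumed physicochemical grouping, for illustration only.
GROUPS = (
    frozenset("GAVLI"),  # small / aliphatic
    frozenset("KRH"),    # basic
    frozenset("DE"),     # acidic
    frozenset("NQ"),     # amide
    frozenset("ST"),     # hydroxyl
    frozenset("FYW"),    # aromatic
    frozenset("CM"),     # sulfur-containing
    frozenset("P"),      # proline, kept on its own
)


def is_conservative(a: str, b: str) -> bool:
    """True if both residues (one-letter codes) fall in the same assumed group."""
    return any(a in group and b in group for group in GROUPS)


def classify_substitutions(wild_type: str, analog: str):
    """List (position, wild-type residue, analog residue, class) for each change."""
    if len(wild_type) != len(analog):
        raise ValueError("sequences must have equal length (substitutions only)")
    return [
        (pos, w, m, "conservative" if is_conservative(w, m) else "non-conservative")
        for pos, (w, m) in enumerate(zip(wild_type, analog), start=1)
        if w != m
    ]


# Wild type: Met Asp Phe Ile Ser Ser; the analog is hypothetical.
for change in classify_substitutions("MDFISS", "MDFVSK"):
    print(change)
```

Whether a given substitution preserves biological activity can only be established by assay; the classification above is a screening aid, not a prediction of function.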

Analogs can differ from naturally occurring Rps2
polypeptide in amino acid sequence or can be modified in
ways that do not involve sequence, or both. Analogs of
the invention will generally exhibit at least 70%,
preferably 80%, more preferably 90%, and most preferably
95% or even 99%, homology with a segment of 20 amino acid
residues, preferably 40 amino acid residues, or more
preferably the entire sequence of a naturally occurring
Rps2 polypeptide sequence [SEQ. ID NOS:2-5].
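
As a rough illustration of the segment-based thresholds above, the following Python sketch treats "homology" as percent identity over an ungapped, position-matched window; that reading is an assumption, as are the function names and the analog sequence. The native segment is the first 20 residues of the Met-initiated sequence shown in SEQ ID NO:2, in one-letter code.

```python
def window_identities(native: str, analog: str, window: int = 20):
    """Yield (start, percent identity) for each position-matched window."""
    if len(native) != len(analog):
        raise ValueError("this sketch compares equal-length sequences only")
    if window > len(native):
        raise ValueError("window is longer than the sequences")
    for start in range(len(native) - window + 1):
        matches = sum(
            n == a
            for n, a in zip(native[start:start + window],
                            analog[start:start + window])
        )
        yield start, 100.0 * matches / window


def meets_threshold(native: str, analog: str,
                    window: int = 20, threshold: float = 70.0) -> bool:
    """True if any window of the given length reaches the identity threshold."""
    return any(pct >= threshold
               for _, pct in window_identities(native, analog, window))


# Met Asp Phe Ile Ser Ser Leu Ile Val Gly Cys Ala Gln Val Leu Cys Glu Ser Met Asn
native_segment = "MDFISSLIVGCAQVLCESMN"
# Hypothetical analog: the first five residues replaced by alanine.
analog_segment = "AAAAA" + native_segment[5:]

print(meets_threshold(native_segment, analog_segment))  # 15 of 20 positions match
```

The `window` and `threshold` parameters can be set to 40 residues, or to 80%, 90%, 95% or 99%, to reflect the more preferred ranges recited above.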

Alterations in primary sequence include genetic
variants, both natural and induced. Also included are
analogs that include residues other than naturally
occurring L-amino acids, e.g., D-amino acids or non-
naturally occurring or synthetic amino acids, e.g., β or
γ amino acids. Also included in the invention are Rps2
polypeptides modified by in vivo chemical derivatization
of polypeptides, including acetylation, methylation,
phosphorylation, carboxylation, or glycosylation.

In addition to substantially full-length
polypeptides, the invention also includes biologically
active fragments of the polypeptides. As used herein,
the term "fragment", as applied to a polypeptide, will
ordinarily be at least 20 residues, more typically at
least 40 residues, and preferably at least 60 residues in
length. Fragments of Rps2 polypeptide can be generated
by methods known to those skilled in the art. The
ability of a candidate fragment to exhibit a biological
activity of Rps2 can be assessed by those methods
described herein. Also included in the invention are
Rps2 polypeptides containing residues that are not
required for biological activity of the peptide, e.g.,
those added by alternative mRNA splicing or alternative
protein processing events.

Other embodiments are within the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Ausubel, Frederick M.
Staskawicz, Brian J.
Brent, Andrew F.
Dahlbeck, Douglas
Katagiri, Fumiaki
Kunkel, Barbara N.
Mindrinos, Michael N.
Yu, Guo-Liang

(ii) TITLE OF INVENTION: RPS2 GENE AND USES THEREOF

(iii) NUMBER OF SEQUENCES: 106

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Fish & Richardson
(B) STREET: 225 Franklin Street Suite 3100
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2904

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30B

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/227,360
(B) FILING DATE: 13-APR-1994
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Clark, Paul T.
(B) REGISTRATION NUMBER: 30,162
(C) REFERENCE/DOCKET NUMBER: 00786/230001

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (617) 542-5070
(B) TELEFAX: (617) 542-8906
(C) TELEX: 100254



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2903 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: Single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAGTAAAAGA AAGAGCGAGA AATCATCGAA ATGGATTTCA TCTCATCTCT TATCGTTGGC 60
TGTGCTCAGG TGTTGTGTGA ATCTATGAAT ATGGCGGAGA GAAGAGGACA TAAGACTGAT 120
CTTAGACAAG CCATCACTGA TCTTGAAACA GCCATCGGTG ACTTGAAGGC CATACGTGAT 180
GACCTGACTT TACGGATCCA ACAAGACGGT CTAGAGGGAC GAAGCTGCTC AAATCGTGCC 240
AGAGAGTGGC TTAGTGCGGT GCAAGTAACG GAGACTAAAA CAGCCCTACT TTTAGTGAGG 300
TTTAGGCGTC GGGAACAGAG GACGCGAATG AGGAGGAGAT ACCTCAGTTG TTTCGGTTGT 360
GCCGACTACA AACTGTGCAA GAAGGTTTCT GCCATATTGA AGAGCATTGG TGAGCTGAGA 420
GAACGCTCTG AAGCTATCAA AACAGATGGC GGGTCAATTC AAGTAACTTG TAGAGAGATA 480
CCCATCAAGT CCGTTGTCGG AAATACCACG ATGATGGAAC AGGTTTTGGA ATTTCTCAGT 540
GAAGAAGAAG AAAGAGGAAT CATTGGTGTT TATGGACCTG GTGGGGTTGG GAAGACAACG 600
TTAATGCAGA GCATTAACAA CGAGCTGATC ACAAAAGGAC ATCAGTATGA TGTACTGATT 660
TGGGTTCAAA TGTCCAGAGA ATTCGGCGAG TGTACAATTC AGCAAGCCGT TGGAGCACGG 720
TTGGGTTTAT CTTGGGACGA GAAGGAGACC GGCGAAAACA GAGCTTTGAA GATATACAGA 780
GCTTTGAGAC AGAAACGTTT CTTGTTGTTG CTAGATGATG TCTGGGAAGA GATAGACTTG 840
GAGAAAACTG GAGTTCCTCG ACCTGACAGG GAAAACAAAT GCAAGGTGAT GTTCACGACA 900
CGGTCTATAG CATTATGCAA CAATATGGGT GCGGAATACA AGTTGAGAGT GGAGTTTCTG 960
GAGAAGAAAC ACGCGTGGGA GCTGTTCTGT AGTAAGGTAT GGAGAAAAGA TCTTTTAGAG 1020
TCATCATCAA TTCGCCGGCT CGCGGAGATT ATAGTGAGTA AATGTGGAGG ATTGCCACTA 1080
GCGTTGATCA CTTTAGGAGG AGCCATGGCT CATAGAGAGA CAGAAGAAGA GTGGATCCAT 1140
GCTAGTGAAG TTCTGACTAG ATTTCCAGCA GAGATGAAGG GTATGAACTA TGTATTTGCC 1200
CTTTTGAAAT TCAGCTACGA CAACCTCGAG AGTGATCTGC TTCGGTCTTG TTTCTTGTAC 1260
TGCGCTTTAT TCCCAGAAGA ACATTCTATA GAGATCGAGC AGCTTGTTGA GTACTGGGTC 1320
GGCGAAGGGT TTCTCACCAG CTCCCATGGC GTTAACACCA TTTACAAGGG ATATTTTCTC 1380
ATTGGGGATC TGAAAGCGGC ATGTTTGTTG GAAACCGGAG ATGAGAAAAC ACAGGTGAAG 1440
ATGCATAATG TGGTCAGAAG CTTTGCATTG TGGATGGCAT CTGAACAGGG GACTTATAAG 1500
GAGCTGATCC TAGTTGAGCC TAGCATGGGA CATACTGAAG CTCCTAAAGC AGAAAACTGG 1560
CGACAAGCGT TGGTGATCTC ATTGTTAGAT AACAGAATCC AGACCTTGCC TGAAAAACTC 1620
ATATGCCCGA AACTGACAAC ACTGATGCTC CAACAGAACA GCTCTTTGAA GAAGATTCCA 1680
ACAGGGTTTT TCATGCATAT GCCTGTTCTC AGAGTCTTGG ACTTGTCGTT CACAAGTATC 1740
ACTGAGATTC CGTTGTCTAT CAAGTATTTG GTGGAGTTGT ATCATCTGTC TATGTCAGGA 1800
ACAAAGATAA GTGTATTGCC ACAGGAGCTT GGGAATCTTA GAAAACTGAA GCATCTGGAC 1860
CTACAAAGAA CTCAGTTTCT TCAGACGATC CCACGAGATG CCATATGTTG GCTGAGCAAG 1920
CTCGAGGTTC TGAACTTGTA CTACAGTTAC GCCGGTTGGG AACTGCAGAG CTTTGGAGAA 1980
GATGAAGCAG AAGAACTCGG ATTCGCTGAC TTGGAATACT TGGAAAACCT AACCACACTC 2040
GGTATCACTG TTCTCTCATT GGAGACCCTA AAAACTCTCT TCGAGTTCGG TGCTTTGCAT 2100
AAACATATAC AGCATCTCCA CGTTGAAGAG TGCAATGAAC TCCTCTACTT CAATCTCCCA 2160
TCACTCACTA ACCATGGCAG GAACCTGAGA AGACTTAGCA TTAAAAGTTG CCATGACTTG 2220
GAGTACCTGG TCACACCCGC AGATTTTGAA AATGATTGGC TTCCGAGTCT AGAGGTTCTG 2280
ACGTTACACA GCCTTCACAA CTTAACCAGA GTGTGGGGAA ATTCTGTAAG CCAAGATTGT 2340
CTGCGGAATA TCCGTTGCAT AAACATTTCA CACTGCAACA AGCTGAAGAA TGTCTCATGG 2400
GTTCAGAAAC TCCCAAAGCT AGAGGTGATT GAACTGTTCG ACTGCAGAGA GATAGAGGAA 2460
TTGATAAGCG AACACGAGAG TCCATCCGTC GAAGATCCAA CATTGTTCCC AAGCCTGAAG 2520
ACCTTGAGAA CTAGGGATCT GCCAGAACTA AACAGCATCC TCCCATCTCG ATTTTCATTC 2580
CAAAAAGTTG AAACATTAGT CATCACAAAT TGCCCCAGAG TTAAGAAACT GCCGTTTCAG 2640
GAGAGGAGGA CCCAGATGAA CTTGCCAACA GTTTATTGTG AGGAGAAATG GTGGAAAGCA 2700
CTGGAAAAAG ATCAACCAAA CGAAGAGCTT TGTTATTTAC CGCGCTTTGT TCCAAATTGA 2760
TATAAGAGCT AAGAGCACTC TGTACAAATA TGTCCATTCA TAAGTAGCAG GAAGCCAGGA 2820
AGGTTGTTCC AGTGAAGTCA TCAACTTTCC ACATAGCCAC AAAACTAGAG ATTATGTAAT 2880
CATAAAAACC AAACTATCCG CGA 2903

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 885 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Lys Lys Glu Arg Glu Ile Ile Glu Met Asp Phe Ile Ser Ser Leu Ile
1 5 10 15

Val Gly Cys Ala Gln Val Leu Cys Glu Ser Met Asn Met Ala Glu Arg
20 25 30

Arg Gly His Lys Thr Asp Leu Arg Gln Ala Ile Thr Asp Leu Arg Ile
35 40 45

Gln Gln Asp Gly Leu Glu Gly Arg Ser Cys Ser Asn Arg Ala Arg Glu
50 55 60

Trp Leu Ser Ala Val Gln Val Thr Glu Thr Lys Thr Ala Leu Leu Leu
65 70 75 80

Val Arg Phe Arg Arg Arg Glu Gln Arg Thr Arg Met Arg Arg Arg Tyr
85 90 95

Leu Ser Cys Phe Gly Cys Ala Asp Tyr Lys Leu Cys Lys Lys Val Ser
100 105 110

Ala Ile Leu Lys Ser Ile Gly Glu Leu Arg Glu Arg Ser Glu Ala Ile
115 120 125

Lys Thr Asp Gly Gly Ser Ile Gln Val Thr Cys Arg Glu Ile Pro Ile
130 135 140

Lys Ser Val Val Gly Asn Thr Thr Met Met Glu Gln Val Leu Glu Phe
145 150 155 160

Leu Ser Glu Glu Glu Glu Arg Gly Ile Ile Gly Val Tyr Gly Pro Gly
165 170 175

Gly Val Gly Lys Thr Thr Leu Met Gln Ser Ile Asn Asn Glu Leu Ile
180 185 190

Thr Lys Gly His Gln Tyr Asp Val Leu Ile Trp Val Gln Met Ser Arg
195 200 205

Glu Phe Gly Glu Cys Thr Ile Gln Gln Ala Val Gly Ala Arg Leu Gly
210 215 220

Leu Ser Trp Asp Glu Lys Glu Thr Gly Glu Asn Arg Ala Leu Lys Ile
225 230 235 240

Tyr Arg Ala Leu Arg Gln Lys Arg Phe Leu Leu Leu Leu Asp Asp Val
245 250 255

Trp Glu Glu Ile Asp Leu Glu Lys Thr Gly Val Pro Arg Pro Asp Arg
260 265 270

Glu Asn Lys Cys Lys Val Met Phe Thr Thr Arg Ser Ile Ala Leu Cys
275 280 285

Asn Asn Met Gly Ala Glu Tyr Lys Leu Arg Val Glu Phe Leu Glu Lys
290 295 300

Lys His Ala Trp Glu Leu Phe Cys Ser Lys Val Trp Arg Lys Asp Leu
305 310 315 320

Leu Glu Ser Ser Ser Ile Arg Arg Leu Ala Glu Ile Ile Val Ser Lys
325 330 335

Cys Gly Gly Leu Pro Leu Ala Leu Ile Thr Leu Gly Gly Ala Met Ala
340 345 350

His Arg Glu Thr Glu Glu Glu Trp Ile His Ala Ser Glu Val Leu Thr
355 360 365

Arg Phe Pro Ala Glu Met Lys Gly Met Asn Tyr Val Phe Ala Leu Leu
370 375 380

Lys Phe Ser Tyr Asp Asn Leu Glu Ser Asp Leu Leu Arg Ser Cys Phe
385 390 395 400

Leu Tyr Cys Ala Leu Phe Pro Glu Glu His Ser Ile Glu Ile Glu Gln
405 410 415

Leu Val Glu Tyr Trp Val Gly Glu Gly Phe Leu Thr Ser Ser His Gly
420 425 430

Val Asn Thr Ile Tyr Lys Gly Tyr Phe Leu Ile Gly Asp Leu Lys Ala
435 440 445

Ala Cys Leu Leu Glu Thr Gly Asp Glu Lys Thr Gln Val Lys Met His
450 455 460

Asn Val Val Arg Ser Phe Ala Leu Trp Met Ala Ser Glu Gln Gly Thr
465 470 475 480

Tyr Lys Glu Leu Ile Leu Val Glu Pro Ser Met Gly His Thr Glu Ala
485 490 495

Pro Lys Ala Glu Asn Trp Arg Gln Ala Leu Val Ile Ser Leu Leu Asp
500 505 510

Asn Arg Ile Gln Thr Leu Pro Glu Lys Leu Ile Cys Pro Lys Leu Thr
515 520 525

Thr Leu Met Leu Gln Gln Asn Ser Ser Leu Lys Lys Ile Pro Thr Gly
530 535 540

Phe Phe Met His Met Pro Val Leu Arg Val Leu Asp Leu Ser Phe Thr
545 550 555 560

Ser Ile Thr Glu Ile Pro Leu Ser Ile Lys Tyr Leu Val Glu Leu Tyr
565 570 575

His Leu Ser Met Ser Gly Thr Lys Ile Ser Val Leu Pro Gln Glu Leu
580 585 590

Gly Asn Leu Arg Lys Leu Lys His Leu Asp Leu Gln Arg Thr Gln Phe
595 600 605

Leu Gln Thr Ile Pro Arg Asp Ala Ile Cys Trp Leu Ser Lys Leu Glu
610 615 620

Val Leu Asn Leu Tyr Tyr Ser Tyr Ala Gly Trp Glu Leu Gln Ser Phe
625 630 635 640

Gly Glu Asp Glu Ala Glu Glu Leu Gly Phe Ala Asp Leu Glu Tyr Leu
645 650 655

Glu Asn Leu Thr Thr Leu Gly Ile Thr Val Leu Ser Leu Glu Thr Leu
660 665 670

Lys Thr Leu Phe Glu Phe Gly Ala Leu His Lys His Ile Gln His Leu
675 680 685

His Val Glu Glu Cys Asn Glu Leu Leu Tyr Phe Asn Leu Pro Ser Leu
690 695 700

Thr Asn His Gly Arg Asn Leu Arg Arg Leu Ser Ile Lys Ser Cys His
705 710 715 720

Asp Leu Glu Tyr Leu Val Thr Pro Ala Asp Phe Glu Asn Asp Trp Leu
725 730 735

Pro Ser Leu Glu Val Leu Thr Leu His Ser Leu His Asn Leu Arg Cys
740 745 750

Ile Asn Ile Ser His Cys Asn Lys Leu Lys Asn Val Ser Trp Val Gln
755 760 765

Lys Leu Pro Lys Leu Glu Val Ile Glu Leu Phe Asp Cys Arg Glu Ile
770 775 780

Glu Glu Leu Ile Ser Glu His Glu Ser Pro Ser Val Glu Asp Pro Thr
785 790 795 800

Leu Phe Pro Ser Leu Lys Thr Leu Arg Thr Arg Asp Leu Pro Glu Leu
805 810 815

Asn Ser Ile Leu Pro Ser Arg Phe Ser Phe Gln Lys Val Glu Thr Leu
820 825 830

Val Ile Thr Asn Cys Pro Arg Val Lys Lys Leu Pro Phe Gln Glu Arg
835 840 845

Arg Thr Gln Met Asn Leu Pro Thr Val Tyr Cys Glu Glu Lys Trp Trp
850 855 860

Lys Ala Leu Glu Lys Asp Gln Pro Asn Glu Glu Leu Cys Tyr Leu Pro
865 870 875 880

Arg Phe Val Pro Asn
885

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Glu His Ser Val Gln Ile Cys Pro Phe Ile Ser Ser Arg Lys Pro Gly
1 5 10 15

Arg Leu Phe Gln
20

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
Ser His Gln Leu Ser Thr
1 5

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Arg Leu Cys Asn His Lys Asn Gln Thr Ile Arg
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Ser Lys Arg Lys Ser Glu Lys Ser Ser Lys Trp Ile Ser Ser His Leu
1 5 10 15 

Leu Ser Leu Ala Val Leu Arg Cys Cys Val Asn Leu 

20 25 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Ile Trp Arg Arg Glu Glu Asp Ile Arg Leu Ile Leu Asp Lys Pro Ser
1 5 10 15

Leu Ile Leu Lys Gln Pro Ser Val Thr
20 25

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Arg Pro Tyr Val Met Thr 

1 5

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Leu Tyr Gly Ser Asn Lys Thr Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Arg Asp Glu Ala Ala Gln Ile Val Pro Glu Ser Gly Leu Val Arg Cys
1 5 10 15

Lys



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Arg Arg Leu Lys Gln Pro Tyr Phe
1 5 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Gly Leu Gly Val Gly Asn Arg Gly Arg Glu 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Gly Gly Asp Thr Ser Val Val Ser Val Val Pro Thr Thr Asn Cys Ala 
1 5 10 15 

Arg Arg Phe Leu Pro Tyr 

20 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Arg Ala Leu Val Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Glu Asn Ala Leu Lys Leu Ser Lys Gln Met Ala Gly Gln Phe Lys
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Leu Val Glu Arg Tyr Pro Ser Ser Pro Leu Ser Glu Ile Pro Arg
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Trp Asn Arg Phe Trp Asn Phe Ser Val Lys Lys Lys Lys Glu Glu Ser
1 5 10 15

Leu Val Phe Met Asp Leu Val Gly Leu Gly Arg Gln Arg
20 25

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Cys Arg Ala Leu Thr Thr Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Ser Gln Lys Asp Ile Ser Met Met Tyr
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

Phe Gly Phe Lys Cys Pro Glu Asn Ser Ala Ser Val Gln Phe Ser Lys
1 5 10 15

Pro Leu Glu His Gly Trp Val Tyr Leu Gly Thr Arg Arg Arg Pro Ala
20 25 30

Lys Thr Glu Leu
35

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Arg Tyr Thr Glu Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Asp Arg Asn Val Ser Cys Cys Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Met Met Ser Gly Lys Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Thr Trp Arg Lys Leu Glu Phe Leu Asp Leu Thr Gly Lys Thr Asn Ala
1 5 10 15

Arg

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Cys Ser Arg His Gly Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

His Tyr Ala Thr Ile Trp Val Arg Asn Thr Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Glu Trp Ser Phe Trp Arg Arg Asn Thr Arg Gly Ser Cys Ser Val Val
1 5 10 15

Arg Tyr Gly Glu Lys Ile Phe
20

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Ser His His Gln Phe Ala Gly Ser Arg Arg Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Val Asn Val Glu Asp Cys His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Glu Glu Pro Trp Leu Ile Glu Arg Gln Lys Lys Ser Gly Ser Met Leu
1 5 10 15

Val Lys Phe



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Leu Asp Phe Gln Gln Arg
1 5 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Thr Met Tyr Leu Pro Phe 
1 5

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Asn Ser Ala Thr Thr Thr Ser Arg Val Ile Cys Phe Gly Leu Val Ser
1 5 10 15

Cys Thr Ala Leu Tyr Ser Gln Lys Asn Ile Leu
20 25

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Lys Arg His Val Cys Trp Lys Pro Glu Met Arg Lys His Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Arg Ser Ser Ser Leu Leu Ser Thr Gly Ser Ala Lys Gly Phe Ser Pro
1 5 10 15

Ala Pro Met Ala Leu Thr Pro Phe Thr Arg Asp Ile Phe Ser Leu Gly
20 25 30

Ile



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Arg Cys Ile Met Trp Ser Glu Ala Leu His Cys Gly Trp His Leu Asn
1 5 10 15

Arg Gly Leu Ile Arg Ser
20

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Leu Ser Leu Ala Trp Asp Ile Leu Lys Leu Leu Lys Gln Lys Thr Gly
1 5 10 15

Asp Lys Arg Trp
20

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Ile Thr Glu Ser Arg Pro Cys Leu Lys Asn Ser Tyr Ala Arg Asn
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Cys Ser Asn Arg Thr Ala Leu 
1 5

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Arg Arg Phe Gln Gln Gly Phe Ser Cys Ile Cys Leu Phe Ser Glu Ser
1 5 10 15

Trp Thr Cys Arg Ser Gln Val Ser Leu Arg Phe Arg Cys Leu Ser Ser
20 25 30

Ile Trp Trp Ser Cys Ile Ile Cys Leu Cys Gln Glu Gln Arg
35 40 45

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

Val Tyr Cys His Arg Ser Leu Gly Ile Leu Glu Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Ser Ile Trp Thr Tyr Lys Glu Leu Ser Phe Phe Arg Arg Ser His Glu
1 5 10 15

Met Pro Tyr Val Gly
20

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Ala Ser Ser Arg Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Thr Cys Thr Thr Val Thr Pro Val Gly Asn Cys Arg Ala Leu Glu Lys
1 5 10 15

Met Lys Gln Lys Asn Ser Asp Ser Leu Thr Trp Asn Thr Trp Lys Thr
20 25 30



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Pro His Ser Val Ser Leu Phe Ser His Trp Arg Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Lys Leu Ser Ser Ser Ser Val Leu Cys Ile Asn Ile Tyr Ser Ile Ser 
1 5 10 15 

Thr Leu Lys Ser Ala Met Asn Ser Ser Thr Ser Ile Ser His His Ser 
20 25 30 

Leu Thr Met Ala Gly Thr 
35 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Glu Asp Leu Ala Leu Lys Val Ala Met Thr Trp Ser Thr Trp Ser His 
1 5 10 15 

Pro Gln Ile Leu Lys Met Ile Gly Phe Arg Val 
20 25 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Arg Tyr Thr Ala Phe Thr Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

Pro Glu Cys Gly Glu Ile Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Phe Arg Asn Ser Gln Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Ala Lys Ile Val Cys Gly Ile Ser Val Ala 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Thr Phe His Thr Ala Thr Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Leu Asn Cys Ser Thr Ala Glu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Ala Asn Thr Arg Val His Pro Ser Lys Ile Gln His Cys Ser Gln Ala 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Glu Leu Gly Ile Cys Gln Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Thr Ala Ser Ser His Leu Asp Phe His Ser Lys Lys Leu Lys His 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: 

Ser Ser Gln Ile Ala Pro Glu Leu Arg Asn Cys Arg Phe Arg Arg Gly 
1 5 10 15 

Gly Pro Arg 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Thr Cys Gln Gln Phe Ile Val Arg Arg Asn Gly Gly Lys His Trp Lys 
1 5 10 15 

Lys Ile Asn Gln Thr Lys Ser Phe Val Ile Tyr Arg Ala Leu Phe Gln 
20 25 30 

Ile Asp Ile Arg Ala Lys Ser Thr Leu Tyr Lys Tyr Val His Ser 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Val Ala Gly Ser Gln Glu Gly Cys Ser Ser Glu Val Ile Asn Phe Pro 
1 5 10 15 

His Ser His Lys Thr Arg Asp Tyr Val Ile Ile Lys Thr Lys Leu Ser 
20 25 30 

Ala 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Val Lys Glu Arg Ala Arg Asn His Arg Asn Gly Phe His Leu Ile Ser 
1 5 10 15 

Tyr Arg Trp Leu Cys Ser Gly Val Val 

20 25 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ile Tyr Glu Tyr Gly Gly Glu Lys Arg Thr 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Leu Glu Gly His Thr 
1 5 



WO 95/28478 




PCT/US95/04570 




(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Pro Asp Phe Thr Asp Pro Thr Arg Arg Ser Arg Gly Thr Lys Leu Leu 
1 5 10 15 

Lys Ser Cys Gln Arg Val Ala 

20 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Cys Gly Ala Ser Asn Gly Asp 
1 5 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Asn Ser Pro Thr Phe Ser Glu Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Ala Ser Gly Thr Glu Asp Ala Asn Glu Glu Glu Ile Pro Gln Leu Phe 
1 5 10 15 

Arg Leu Cys Arg Leu Gln Thr Val Gln Glu Gly Phe Cys His Ile Glu 
20 25 30 

Glu His Trp 
35 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Ala Glu Arg Thr Leu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Ser Tyr Gln Asn Arg Trp Arg Val Asn Ser Ser Asn Leu 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: 

Arg Asp Thr His Gln Val Arg Cys Arg Lys Tyr His Asp Asp Gly Thr 
1 5 10 15 

Gly Phe Gly Ile Ser Gln 

20 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Arg Arg Arg Lys Arg Asn His Trp Cys Leu Trp Thr Trp Trp Gly Trp 
1 5 10 15 

Glu Asp Asn Val Asn Ala Glu His 

20 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Gln Arg Ala Asp His Lys Arg Thr Ser Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Cys Thr Asp Leu Gly Ser Asn Val Gln Arg Ile Arg Arg Val Tyr Asn 
1 5 10 15 

Ser Ala Ser Arg Trp Ser Thr Val Gly Phe Ile Leu Gly Arg Glu Gly 
20 25 30 

Asp Arg Arg Lys Gln Ser Phe Glu Asp Ile Gln Ser Phe Glu Thr Glu 
35 40 45 

Thr Phe Leu Val Val Ala Arg 
50 55 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Cys Leu Gly Arg Asp Arg Leu Gly Glu Asn Trp Ser Ser Ser Thr 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Arg Asp Arg Arg Arg Val Asp Pro Cys 
1 5 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Gln Gly Lys Gln Met Gln Gly Asp Val His Asp Thr Val Tyr Ser Ile 
1 5 10 15 

Met Gln Gln Tyr Gly Cys Gly Ile Gln Val Glu Ser Gly Val Ser Gly 
20 25 30 

Glu Glu Thr Arg Val Gly Ala Val Leu 
35 40 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Gly Met Glu Lys Arg Ser Phe Arg Val Ile Ile Asn Ser Pro Ala Arg 
1 5 10 15 

Gly Asp Tyr Ser Glu 

20 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Met Trp Arg Ile Ala Thr Ser Val Asp His Phe Arg Arg Ser His Gly 
1 5 10 15 

Ser 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Ile Ser Ser Arg Asp Glu Gly Tyr Glu Leu Cys Ile Cys Pro Phe Glu 
1 5 10 15 

Ile Gln Leu Arg Gln Pro Arg Glu 
20 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Ser Ala Ser Val Leu Phe Leu Val Leu Arg Phe Ile Pro Arg Arg Thr 
1 5 10 15 

Phe Tyr Arg Asp Arg Ala Ala Cys 

20 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Val Leu Gly Arg Arg Arg Val Ser His Gln Leu Pro Trp Arg 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

His His Leu Gln Gly Ile Phe Ser His Trp Gly Ser Glu Ser Gly Met 
1 5 10 15 

Phe Val Gly Asn Arg Arg 

20 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Glu Asn Thr Gly Glu Asp Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Lys Thr His Met Pro Glu Thr Asp Asn Thr Asp Ala Pro Thr Glu Gly 
1 5 10 15 

Leu Phe Glu Glu Asp Ser Asn Arg Val Phe His Ala Tyr Ala Cys Ser 
20 25 30 

Gln Ser Leu Gly Leu Val Val His Lys Tyr His 
35 40 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Cys Gly Gln Lys Leu Cys Ile Val Asp Gly Ile 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Gly Ala Asp Pro Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Ser Arg Lys Leu Ala Thr Ser Val Gly Asp Leu Ile Val Arg 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Gln Asn Pro Asp Leu Ala 
1 5 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Asp Ser Val Val Tyr Gln Val Phe Gly Gly Val Val Ser Ser Val Tyr 
1 5 10 15 

Val Arg Asn Lys Asp Lys Cys Ile Ala Thr Gly Ala Trp Glu Ser 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Lys Thr Glu Ala Ser Gly Pro Thr Lys Asn Ser Val Ser Ser Asp Asp 
1 5 10 15 

Pro Thr Arg Cys His Met Leu Ala Glu Gln Ala Arg Gly Ser Glu Leu 
20 25 30 

Val Leu Gln Leu Arg Arg Leu Gly Thr Ala Glu Leu Trp Arg Arg 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Ser Arg Arg Thr Arg Ile Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Leu Gly Ile Leu Gly Lys Pro Asn His Thr Arg Tyr His Cys Ser Leu 
1 5 10 15 

Ile Gly Asp Pro Lys Asn Ser Leu Arg Val Arg Cys Phe Ala 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Thr Tyr Thr Ala Ser Pro Arg 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Thr Pro Leu Leu Gln Ser Pro Ile Thr His 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Pro Trp Gln Glu Pro Glu Lys Thr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Leu Gly Val Pro Gly His Thr Arg Arg Phe 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Leu Ala Ser Glu Ser Arg Gly Ser Asp Val Thr Gln Pro Ser Gln Leu 
1 5 10 15 

Asn Gln Ser Val Gly Lys Phe Cys Lys Pro Arg Leu Ser Ala Glu Tyr 
20 25 30 

Pro Leu His Lys His Phe Thr Leu Gln Gln Ala Glu Glu Cys Leu Met 
35 40 45 

Gly Ser Glu Thr Pro Lys Ala Arg Gly Asp 
50 55 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Thr Val Arg Leu Gln Arg Asp Arg Gly Ile Asp Lys Arg Thr Arg Glu 
1 5 10 15 

Ser Ile Arg Arg Arg Ser Asn Ile Val Pro Lys Pro Glu Asp Leu Glu 
20 25 30 

Asn 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Gly Ser Ala Arg Thr Lys Gln His Pro Pro Ile Ser Ile Phe Ile Pro 
1 5 10 15 

Lys Ser 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Asn Ile Ser His His Lys Leu Pro Gln Ser 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Glu Thr Ala Val Ser Gly Glu Glu Asp Pro Asp Glu Leu Ala Asn Ser 
1 5 10 15 

Leu Leu 



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GGTAGTGAGT AGAGAATAAC 

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Glu Leu Arg Ala Leu Cys Thr Asn Met Ser Ile His Lys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

Gln Glu Ala Arg Lys Val Val Pro Val Lys Ser Ser Thr Phe His Ile 
1 5 10 15 

Ala Thr Lys Leu Glu Ile Met 
20 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Lys Pro Asn Tyr Pro Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1491 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 



ATCGATTGAT CTCTGGCTCA GTGCGAGTAG TCCATTTGAG AGCAGTCGTA GCCCCGCGTG 60 

GCGCATCATG GAGCTATTTG GAATTTTCGC AGGGTTATCG ATTCGTAGTG GGAACCCATT 120 

CATTGTTTGG AACCACCAAC GGACGACTTA ACAAGCTCCC CGAGGTGCAT GATGAAAATT 180 

GCTCCAGTTG CCATAAATCA CAGCCCGCTC AGCAGGGAGG TCCCGTCACA CGCGGCACCC 240 

ACTCAGGCAA AGCAAACCAA CCTTCAATCT GAAGCTGGCG ATTTAGATGC AAGAAAAAGT 300 

AGCGCTTCAA GCCCGGAAAC CCGCGCATTA CTCGCTACTA AGACAGTACT CGGGAGACAC 360 

AAGATAGAGG TTCCGGCCTT TGGAGGGTGG TTCAAAAAGA AATCATCTAA GCACGAGACG 420 

GGCGGTTCAA GTGCCAACGC AGATAGTTCG AGCGTGGCTT CCGATTCCAC CGAAAAACCT 480 

TTGTTCCGTC TCACGCACGT TCCTTACGTA TCCCAAGGTA ATGAGCGAAT GGGATGTTGG 540 

TATGCCTGCG CAAGAATGGT TGGCCATTCT GTCGAAGCTG GGCCTCGCCT AGGGCTGCCG 600 

GAGCTCTATG AGGGAAGGGA GGCGCCAGCT GGGCTACAAG ATTTTTCAGA TGTAGAAAGG 660 

TTTATTCACA ATGAAGGATT AACTCGGGTA GACCTTCCAG ACAATGAGAG ATTTACACAC 720 

GAAGAGTTGG GTGCACTGTT GTATAAGCAC GGGCCGATTA TATTTGGGTG GAAAACTCCG 780 

AATGACAGCT GGCACATGTC GGTCCTCACT GGTGTCGATA AAGAGACGTC GTCCATTACT 840 

TTTCACGATC CCCGACAGGG GCCGGACCTA GCAATGCCGC TCGATTACTT TAATCAGCGA 900 

TTGGCATGGC AGGTTCCACA CGCAATGCTC TACCGCTAAG TAGCAGGGTA TCTTCACGTG 960 

GCGGCATCAT GACAAGCCCA TGATGCCGCC AGCAGCTACC TGAATGCCGT CTGGCTTTTT 1020 

GGTCCCTATT GTCGTATCCG GAAGATGACG TCAAAGAATC TCGGCAAGAG CTTTCTTGCT 1080 

CGACTCCTCA GCTTCCGGAT CGATCAGGTC GCTTGCCAGA GCGCGCTTGT CCATGAGCAT 1140 

CTGCCACAGC TGCTGGTCGA TGGTGTGCTC AGCTAAAGGG ATTTTGACGA CAACCATGCG 1200 

CAACTGCCCG TTGCGATACG CTCGATCCTG AAGCCCCGGT GTCCATGGCA GCCCCAAGAA 1260 

AAAGACATAG TTCGCCGCTG TGAGGTTGTA GCCTGTGCCG GCGGCCGACC TGGTCCCGAT 1320 

AAACACCCTG CAGTCCGGAT CCTGCTGGAA AGCATCAATC GCCTTCTGCC GCTTCTTGGG 1380 

CGAGTCACTG CCCACCAACG TCACGCACCC GACGCCAAGC TTGAGGCAGT GCTCCCGCAA 1440 

CGTGGCCACG GATTCCTGAT ACTCGCAGAA GAGGATCACC TTGTCGTCGA C 1491 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 255 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Met Lys Ile Ala Pro Val Ala Ile Asn His Ser Pro Leu Ser Arg Glu 
1 5 10 15 

Val Pro Ser His Ala Ala Pro Thr Gln Ala Lys Gln Thr Asn Leu Gln 
20 25 30 

Ser Glu Ala Gly Asp Leu Asp Ala Arg Lys Ser Ser Ala Ser Ser Pro 
35 40 45 

Glu Thr Arg Ala Leu Leu Ala Thr Lys Thr Val Leu Gly Arg His Lys 
50 55 60 

Ile Glu Val Pro Ala Phe Gly Gly Trp Phe Lys Lys Lys Ser Ser Lys 
65 70 75 80 

His Glu Thr Gly Gly Ser Ser Ala Asn Ala Asp Ser Ser Ser Val Ala 
85 90 95 

Ser Asp Ser Thr Glu Lys Pro Leu Phe Arg Leu Thr His Val Pro Tyr 
100 105 110 

Val Ser Gln Gly Asn Glu Arg Met Gly Cys Trp Tyr Ala Cys Ala Arg 
115 120 125 

Met Val Gly His Ser Val Glu Ala Gly Pro Arg Leu Gly Leu Pro Glu 
130 135 140 

Leu Tyr Glu Gly Arg Glu Ala Pro Ala Gly Leu Gln Asp Phe Ser Asp 
145 150 155 160 

Val Glu Arg Phe Ile His Asn Glu Gly Leu Thr Arg Val Asp Leu Pro 
165 170 175 

Asp Asn Glu Arg Phe Thr His Glu Glu Leu Gly Ala Leu Leu Tyr Lys 
180 185 190 

His Gly Pro Ile Ile Phe Gly Trp Lys Thr Pro Asn Asp Ser Trp His 
195 200 205 

Met Ser Val Leu Thr Gly Val Asp Lys Glu Thr Ser Ser Ile Thr Phe 
210 215 220 

His Asp Pro Arg Gln Gly Pro Asp Leu Ala Met Pro Leu Asp Tyr Phe 
225 230 235 240 

Asn Gln Arg Leu Ala Trp Gln Val Pro His Ala Met Leu Tyr Arg 
245 250 255 



What is claimed is: 

1. Substantially pure DNA encoding an Rps 
polypeptide. 

2. The DNA of claim 1, wherein said DNA contains 
the RPS2 gene [SEQ. ID NO:1]. 

3. The DNA of claim 1, wherein said DNA is 
genomic DNA. 

4. The DNA of claim 1, wherein said DNA is cDNA. 

5. The DNA of claim 1, wherein said DNA is of a 
plant of the genus Arabidopsis. 

6. Substantially pure DNA having the sequence 
[SEQ. ID NO:1] of Fig. 2, or degenerate variants thereof, 
and encoding the amino acid sequence [SEQ. ID NOS:2-5] of 
open reading frame "a" of Fig. 2. 

7. Substantially pure DNA having about 50% or 
greater sequence identity to the DNA sequence [SEQ. ID 
NO:1] of Fig. 2. 

8. The DNA of claim 1 or 2, wherein said DNA is 
operably linked to regulatory sequences for expression of 
said polypeptide; and 
wherein said regulatory sequences comprise a 
promoter. 

9. The DNA of claim 8, wherein said promoter is a 
constitutive promoter. 

10. The DNA of claim 8, wherein said promoter is 
inducible by one or more external agents. 

11. The DNA of claim 8, wherein said promoter is 
cell-type specific. 

12. A cell which contains the DNA of claim 1. 

13. The cell of claim 12, said cell being a plant 
cell. 

14. The plant cell of claim 13, said plant cell 
being resistant to disease caused by a plant pathogen 
carrying an avirulence gene generating a signal 
recognized by an Rps polypeptide. 

15. The plant cell of claim 14, said plant 
pathogen carrying an avrRpt2 gene. 

16. The plant cell of claim 14, said plant cell 
being from the group of plants comprising Arabidopsis, 
tomato, soybean, bean, maize, wheat, and rice. 

17. The plant cell of claim 14, said plant 
pathogen being Pseudomonas syringae. 

18. The plant cell of claim 13, wherein said 
plant cell further contains an avrRpt2 gene operably 
linked to regulatory sequences; and 
wherein said regulatory sequences comprise a 
promoter. 

19. The plant cell of claim 18, wherein said 
promoter is a constitutive promoter. 

20. The plant cell of claim 18, wherein said 
promoter is inducible by one or more external agents. 

21. The plant cell of claim 18, wherein said 
promoter is cell-type specific. 

22. A transgenic plant which contains the DNA of 
claim 1 integrated into the genome of said plant, wherein 
said DNA is expressed in said transgenic plant. 

23. A transgenic plant which contains the DNA of 
claim 8 integrated into the genome of said plant, wherein 
said DNA is expressed in said transgenic plant. 

24. A transgenic plant generated from the plant 
cell of claim 18 wherein said DNA and said avrRpt2 gene 
are expressed in said transgenic plant. 

25. A seed from a transgenic plant of claim 22. 

26. A seed from a transgenic plant of claim 23. 

27. A seed from a transgenic plant of claim 24. 

28. A cell from a transgenic plant of claim 22. 

29. A cell from a transgenic plant of claim 23. 

30. A method of providing resistance to a plant 
pathogen in a plant, said method comprising: 

producing a transgenic plant cell comprising the 
DNA of claim 1 integrated into the genome of said 
transgenic plant cell and positioned for expression in 
said plant cell; and 

growing a transgenic plant from said plant cell 
wherein said DNA is expressed in said transgenic plant. 

31. A method of detecting a resistance gene in a 
plant cell, said method comprising: 

contacting the DNA of claim 1 or a portion thereof 
greater than about 18 nucleic acids in length with a 
preparation of genomic DNA from said plant cell under 
hybridization conditions providing detection of DNA 
sequences having about 50% or greater sequence identity 
to the sequence [SEQ. ID NO:1] of Fig. 2. 

32. A method of producing an Rps2 polypeptide 
comprising: 

providing a cell transformed with DNA encoding an 
Rps2 polypeptide positioned for expression in said cell; 

culturing said transformed cell under conditions for 
expressing said DNA; and 

isolating said Rps2 polypeptide. 

33. A method of providing, in a transgenic plant, 
resistance to a plant pathogen, said method comprising: 

producing a transgenic plant cell comprising the 
DNA of claim 8 integrated into the genome of said 
transgenic plant cell and positioned for expression in 
said plant cell; and 

growing said transgenic plant from said plant cell 
wherein said DNA is expressed in said transgenic plant. 

34. A method of providing, in a transgenic plant, 
resistance to a plant pathogen, said method comprising: 

growing said transgenic plant from the plant cell 
of claim 18 wherein said DNA and said avrRpt2 gene are 
expressed in said transgenic plant. 

35. A method of isolating a disease resistance 
gene or portion thereof in plants having sequence 
identity to RPS2 [SEQ. ID NO:1], said method comprising: 

amplifying by PCR said disease resistance gene or 
portion thereof using oligonucleotide primers wherein 
said primers 

(a) are each greater than 13 nucleotides in 
length; 

(b) each have regions of complementarity to 
opposite DNA strands in a region of the nucleotide 
sequence [SEQ. ID NO:1] of Fig. 2; and 

(c) optionally contain sequences capable of 
producing restriction enzyme cut sites in the amplified 
product; and 

isolating said disease resistance gene or portion 
thereof. 

36. A substantially pure Rps2 polypeptide. 

37. The polypeptide of claim 32, comprising an 
amino acid sequence substantially identical to an amino 
acid sequence [SEQ. ID NOS:2-5] shown in Fig. 2. 

38. A vector comprising the DNA of claim 1, said 
vector being capable of directing expression of the 
peptide encoded by said DNA in a vector-containing cell. 

39. A vector comprising the DNA of the avrRpt2 
[SEQ. ID NO:105] gene operably linked to regulatory 
sequences wherein said regulatory sequences comprise a 
promoter. 

40. A vector comprising the DNA of claim 1 and 
the DNA of the avrRpt2 gene [SEQ. ID NO:105] operably 
linked to regulatory sequences wherein said regulatory 
sequences comprise a promoter. 
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a 



c 



AXCTAAAXGXAACAGCGAGAAATCATCGAAATCGXTITCXTCTCATCTCrrXTC 

^ ZlZl'lIZll * * * ♦ 60 

TTCXTTTTCirrCTCCCTCTTTAGTAC^ 



K«KKSREI IE DFISSLIV C 

b SKRXSEKSSKW ISSHLLSLX- 

c VKERARNHRNGrKLISYRWL- 



61 



TGTCCTCACGTGTTGTCTCAATCTATGXXTATGGCGCXCXCJUVGXGGACATXAGAC^ 



XCXCCACTCCACAXCACXCTTXGXTXCTTATACCCCCTCTCTTCTCCIGTATTCTGACTX' 



-►120 



CAQVLCESKNMXERRCHKTD 
b VLRCCVNL*IWRRKED1RLI- 
C CSCVV«lyEVOGEKRT*D*S. - 



121 



CTTAGACXACCCATCACTGATCTTGAAACAGCaiTCCGTaACriCAAG^ 

"""T""*" *--f--- — ♦ 

GAATCTGTTCCCTAGTGXCrXCXXCTTTGTCCCTXGCCACTCXXCTTC 



180 



* LRQAITDLETXICDLKAIRD 

b LDKPSLILKQPSVT*RPY VM- 

e •TSHH *S»NSHR*LEC;HT**- 

CXCCTCACTITACCGATCCXACXXCXCCGTCTAGAGGGACGJUWSCTG^^ 
181 ♦ ^ ^ ^ ^ 240 

CTTCACTGAAJVTCCCTAGGTItSTTCTCCCAGATCTCCCTCCTTCCACCJ^T^ 

a OLTLRIQQDGLEGRSCSNRA 

b T»LVCSNKTV-RDEAAQIVP. 

PDFTDPTRRSRGTKtiLKSCQ- 



AGACACTOGCrrACTCCGGTGCXXCTAJUZGGXGACTAAAACJ^CCCCTACTTTTACT^ 

241 - * — — — — 300 

TCTCTCACCGAATCACCCCACCTTCATTGCCTCTCATTTTOTCCCCA 

a REWLSAVQVTETKTALLLVR .- 

b ESCLVRCK»RRLKQPYr**C- 
c RVA*CCASNGD»NSP'rFSEV- 

TTTAjGGCGTCCCCAACACACCACOCCAATGACCACGAGATACCTCAG'iTCTTTCGGTTC 

301 — ^ * * — 360 

AXXTCCCCAGCCCTTGTCTCCTGCCCTTACTCCTCCTCTATGCAGTC.U^CAAXCCC 

a FRRREQRTRMRRRYLSCFGC - 

b LCVGNRGRE'CGDTSVVSVV- 

c •asctedaneeeipq :„frlc;- 

GCCGACTACAAACTGTGCAAaAAGGTrTCTCCCATATTCAACAGCATrx^TCAC^^ 



FIGURE 2 
361 ^ ^ ♦ * ♦ * 420 

CGGCTCJl lU * lTlU ACJkCC lnK: * r lX:CAX^SACGGTXTXACTTCTC 

« XDYKLCKXVSXILKSI13ELR 

b PTTNCXRRrLPY»RALVS*E- 

c RLQTVQECrCHIESHW»AE R - 

CAACGCTCTCJU^CCTATCAAAACAOATOCCMCTCAX^^ 

421 * — ♦ -480 

CTTCCGACLACTTCCXTAGTTTTCTCTACCTCCCACTTAAGTTCAT^ 

& ERSEAIKTDGCSIQVTCREI 

b NALKLSKQMAGQFK*LVERY- 

C TL*SYQNRWRVKSSNL *RDT- 

CCCATCAAGTCCGTTGTCGGAAATACCACGATCATCGAACAGCTTrrcXjAATT^ 
481 4-- + 540 

ccctacttcacgcaacagcctttatogtgctactaccttgtccaaaacctta;^^ 

& piksvvcnttmmeqvlefls 

b pssp1-seipr*wnrfwnfsv- 

c hqvrcrkyhddgtgfcisq*- 

g aac aagaagjuuu3aggaatcattggtctttatggacctcctggo 

541 ---•-4. -1. — ^ * ♦ 600 

Cl ' l ' LTlX^ riXJ l ' in ^l^- C rrAGTAACCACAAATACCTCGACCACCCCAACCCTTCTCTTGC 

a EEEERGIIGVYG PCGVGKTT - 

b KKKKEESLVFMDLVGLGRQR- 
c RR RKRNHWCLWTWWCWEDNV- 

TTAATCCAGAGCATTAACAACGACCTGATCACAAAACGACATCACTATCATCTACTGAT^ 
601 .-^ ♦ + ♦ * — ♦ 660 

AATTACCTCTCGTAA l ' IUTnix CTCaACTAGTC lT n ^J CTGTAGTCATJLCTACATOACTAA 

a LM -QSINNELITKGHQY DVLI 

b •CRALTT S*SQKDISMMY*F- 

c NAEK*QRADHKRTSV ^ CTDL- 

TCCGTrcAAATCTCCAGAGAATTCGGCGACTCTACAATTCAGCAACCCGTT^ 

661 * »■ + ♦ ♦ ♦ 720 

ACCCAAGTTTACAGGTCTCTTAAGCCCCTCACATCTXAAGTCGTrCCCCAACCTCCTGCC 

a WVQMSREFGECTIQOAVCAR 

b CFKCPENSASVQFSKPLEHG- 

C GSNVQRIRRVYNSAS UWSTV- 

TTGGGTTTATCTTCGCACCAaAAGGACACCCGCGAAAACAGACCTTTC;AAGATATACAGA 

721 ♦ * * 780 

AACCCAAATACAACCCTGCTCTTCCTCTCGCCCCTrrTGTCTCCAAAcrrrCTATATGT^ 

& LGLSWDEKETGENRALKIYR 

b WVYLGTRRRPAKTEL'RYTE- 

C GFILG REGDRRKQSPEDIQ5- 

CCTTTGAGACAGAAACG l ' i nvr : x;i'I CT ' :O CTAGATCATCTCTGGGX\GAGATAGACT^ 

781 . ^ ♦ + 4 840 

CG AAACT^TOTCTTICC AAAGAAC AAC AACG ATCTACTAC AC ACCCT rCTCT ATCTGAAC 

a ALRQJCRFLLLLDDVWEEIDL 

b L»DRKVSCCC»MM SGKR*TW- 

c fetetflvvar*clg:=idr lc.- 

FIG. 2 CONTINUED 
GAC>JU^ACTCaACTTCCTC G ACCTGACAGGGJ\J^ 
841 + + + -I- ♦ 900 

A E KTCVPRPDRENKCKVMFT T 

b RKL2FLDLTCKTNAR* CS RH - 

c ENWfiSST*OCKQMgGD VHDT- 

CCCTCTATAGCATTATCCAACAATATCKWTCCCXSAATACAACT^ 
901 + ♦ — > — - ^ 4. 960 

GCCJ^ATATCGTAATACGTTCTTATACCCACGCCTTATCrrCAXCTCTCAC^ [ 

& RSIXLCKNMGXEYKLRVEPL 

b GL'HYATIWVRNTS^EWSFW:. 

c VVS IM QCVGCCIQVES. CVSC- 

GAGAAG^JO^CACGCGTCCGACCTGTTCTGTAGTAACCTATCGAGAAJ^^ 

961 ^ + + — — '►1020 

CTCTTCrriCT C CGCACCCTCGACAAGACATCATTCCATACCTC'rri'rL I'AGAAAATCTC 



a 



EK". HAWELFCSKVWRXDLLE 
b RF'^:TRGS CSVVRyCEK lF*r- 

c 2STRVGAVL**CMEKRSFRV- 

TCATCATCAATTCCCCGGCTCCCCGAGATTATAGTGAGTAAATGTGGACaATT^ 

1021 ------- — ♦ * — * — ♦ 1080 

ACTACTAGTTAAOCGGCCGACCGCCTCTAATATCACTCATTTACACCTCCTAACCGTCAT 



SSSIRRLAEIIVSKCCGLPL 



b HHQFAGSRRL**VNVEDCH*- 
c IINSPARGDYSE*MWRIAT S - 

CCCTTCATCACTTTAGGAGCJUGCCATCCCTCATAGAGACACAGAACAAGAGT^ 

1081 — ♦ — — ♦ ^ ♦ 1140 

CCCAACTACTCAAATCCTCCTCCGTACCCAGTATCTCTCTGTCTTCTTCTCACCTAGGTA 

a ALITLGGAMAHRETEEE WIH 

b R»SL*EEPWLIERCKKSGSM - 

c VDHFRRSHGS*ROR RF'. VDPC- 

GCTAGTCAACTTCTGACTACATTTCCAGCAGAGATGAAGGGTATCAACTATCTATTTGCC 

1141 - ^ + + 1200 

CGATCACTTCAACACTGATCTAAAMTCGTCTCTACTTCCCATACTTCATACATAAACGC 

a ASEVLTRFPAEM KGMNYVFA 

b LVKF 'LDFQOR^RV'TMYLP- 

c ♦*SSD*ISSRDEG YEI. CICP- 

CTTTTGAAATTCAGCTACGACAACCTCGAGACTGATCTGCriCCaTCTTCTT^ 

1201 * ^ + * * 1260 

GAAX\CTTTAACTCGATGCTCrTGGAGCTCTCACTAGACGAAGCCACAACAAAGAA^ 

a LLK FSYDNLESDLLR. SCFLY 

b 7*HSATTTSRV ICFGL VSCT - 

c FEIQLRgPRE'SASVLFLVL - 

TGCGCTTTATTCCCAGAAGAACATTCTATAGAGATCCAGCAGCri^TrcACTACTGGaT^ 

-j^-c-^ ^ * ^ ^ + ♦ 1320 

ACGCGAAATAACGGTCTTCTTGTW^CATATCTCTAGCTCCTCCAACAACTCATC 

a CALFPEEHSIEIEQLVEYWV - 

FIG. 2 CONTINUED 
b XLYSQKNIL*RSSSLLSTaS- 
c RFIPRRTFYRDRAAC^VLGR- 

OCCCAACCC ill\,UV ACCACCTCCCATGCCCTrJUkCACCXTTTXCJUU^ 
1321 * ♦ * ^ ^. 1380 

CCG Cri\: CC:JUU^XGTGGTCGAGGGTACCCX:A A T r aiX X ;TAAATI^^ 

a GEGFLTSSHOVNTIYKCYFL 

to XKGFSPXPMXLTPFTRDIFS- 

c RRVSHOLPWR'HHLQCIFSH- 

ATTCKKWATCTGAAXGCGGCXTGTTTGTTCGAAACCCCACATC 
1381 ♦ ♦ ♦ 4. + 1440 

TXXCCCCTXGAC rint; GCCGTACXXACXACC UU ' It; GCCTCTAClCTinTOTGTCCACTTC 

& ICDLKXXCLLETCDEKTQVK 

b L.Cr*KRHVCWKPEMRKHR*R- 

c WCSE SGMrvCN RR*ENTCED> 

ATOCXTAATGTCGTCAGAAGCTTTCCATTCTGGXTGGCXTCTCAACAGGGGACCT 

1441 + * ♦ ♦ ♦ 1500 

TACCTATTACArr.;wrCTrCC;-;j.CCTAXCACCTACCCTACArTTGTCCCCTGXATATTC ' 

a MHNVVRSFXLWMXSEQCTYK 

b CIMWSEALHCCWHLNRCLIR- 

c X *CCOKLCIVDCI*TCDL'C- 

aAGCTGXTCCTXGTTGXGCCTAGCXTGGGACXTXCTGXXGCTCCTXXXGCXGXXXXC^^ 

1501 ^ * * + ♦ 1560 

CTCGACTACCATCAA C TCCCATCCTACCCTCTATCACTTCCACCATTTCGTC U ' l ' n 

a ELILVE PSMG HTEAPRXENW 

b S-S*LSL, XWOILKLLKQKTC - 

c ADPS*A*HCTY*SS«SRKLA- 

CGACAACCGTTOGTGATCTCATTGTTAGATAACAGAATCCAGACCTTtKrCTC 

1561 + ♦ ♦ * ♦ ^ 1620 

GCTGTTCCCAACCACTAGACTAACAATCTATTGTCTTAGCTCTGaAACCGACT'TT^ 

a RQXLVISLLDNRIQTLPEKL 

b DKRW«S HC*1TESRPCLKNS- 

c TSVCDLIVR»ONPDLA*KTH- 

ATATCCCCGAAACTGACAACACTGATGCTCCAACAGAACAOCTCTTTGAAGAAGATTCCA 

1621 f ^ + + + + 1680 

TXTXCCGiSCTTICXCTGTTCTGACTACCACGriCTCTTCTCCACAAACTTCT^ 

a I CPKLTTLMLQQNSSLKKIP 

b YARK*OH*CSNRTAL*RR FQ - 

c. MPETDNTDAPTEQLFEEDSN- 

ACACCGTTTTTCATCCATATX^CTCTTCTCAGAGTCTTGGAC^^ 

1681 ♦ * ♦ * ♦ 1740 

TGTCCCAAAAAGTACGTATACGGACAAGAGTCTCAGAACCTGAACAGCAAGTGTTCATAG 

a TGFFM HMPVLRVL DLSFTSI 

b Q GFSCICLFSESWTCRSQVS- 

c RVFHAYX CSQSLCLVVHKYH- 

XCTCXGATTCCGTTGTCTXTCXAGTXTTTGGTGGAGTTCTATCATCTCrc 

1741 + — ^ * ♦ 1800 

TC ACTCTAAGCC AAC AG ATACTTC ATAAXCC AC CTC AAC AT AGTAG AC AG AT AC AGTCCT 

FIG. 2 CONTINUED 
& TSIPLSIKYLVELYHLSKSG 

b LRPRCLSSIWWSCIICLCQE- 

C *DSVVVQVPGaVVSSV:yVRN- 

ACXW^jaATXXGTGTATTGCCACACCACCTTCGCXATCIT'*^^ 

1801 — + — .-^^ --J-f- f -:1860 

TCrrTTCTXTTCACXTAACCCTCTCCTCCAACCCriTXC^ 

a TKISVLPQELGNLRK LKHLD 

b QR*VYCHRSLGI LEN*.SIWT- 

c KDICCIXTG .AWES^KTEXSGP- 

CTACAXACXACTCAGTrTCTTCAGACGATCCCACGACATCCCATATGTT^ 

1861 + * + ^^--.+ — +1920 

GATG ll ' lVlTO AGTCAAAGAAGTCTGCTAGGGTGCTCTACGGTATACAACCGACTC 

a LQRTQFLQTIPRDAICWLSK 

b YKELSFFRRSHEMPYV G*AS- 

c TKNS VSSDDPTRCHMLAEQA- 

CTCGAGGTTCTCAACTTCTACTACAGTTACCCCGGTTCGCAACTCCAciAG 

1921 ♦ ^ ♦ 1980 

GACCrCCA;^ACTTCAACATGATGTCAATGCGGCCAACCCTTCACGTC^ 

A LEVLKLYYSYACWELQ^rCE 

b SRF'TCTTVTPVGNCR^ALEK- 

C RGSELVLQLRRLCTAELW RR;- 

GATGAAGCACJU^GAACTCCaATTCGCT^lJ^CTTGGAATACTTGGAA^ 

1981 -^ 4- ♦ * ZZZZT"*' * 2040 

CTACTTCGTCTTCTTCACCCTAAGCCACTGAACCTTATGAACCTTTTTC 

a DEAEELOFADLEYLE NLTTt- 

b MKQKNSDSLTWNTWKT *PHE- 

c *SRRTRIR*LGILG RrNHTR- 

GGTATCACTGTTCTCTCATTCGAGACCCTAAAAACTCTCTTCGAGTTC 

2041 *" --'^ — — — --^ 2100 

CCATACTCACAAGAGACTAACCTCTCGOATTITTaAaAGAAGCTCAA^ 

a GITVLSLETLKTLFEFGALH 

b VSLFSHWRP»KLSSSSVLCI- 

c YHCSLICOPKNSLRVRCFA*- 

AAACATATACACCATCTCCACGTTGAAGAGTCCAATGAACTCCTCTACTTC 

2101 -f- * f + * 2160 

TTTCTATATCTCGTXGAGCTCCAACTTCTCACGTrACTTGAGGACATGAAGTTA^ 

a K HIQHLHVEECNELLYFKLP 

b KIYSISTLXSAMNSST. SISH- 

c tyTASPR'RVQ»TPL7-,QSP:. - 

TCACTCACTAACCATGGCACCAACCTGAGAACACTTAGCATTAAAAG'nX^CATGACT^ 

2161 * — * - ^ 2220 

ACTCAGTGATTCGTACCGTCCTTGGACTCTTCTGAATCGTAATTTTCAACGGTACTCA^ 

a SLTNHGRNLRRLSIKS CHDL 

b HSLTMACT'EDLALKV:AHTW- 

c th* pwqepekt'H* k:-p*lc - 

GAGTACCTCGTCACACCCGCAGATTTTCAAAATCATTGGCTTCCCAGTCTAGACCTTC 
2221 <- ^ + + + + 2280 

CTCATt»ACCAGTGTGGGCCriCTAAAACTTTTACTA^^ ' 

a SYLVTPXDFrNDWLPS LKVL 

b STWSHPQILXMIGFRV^RP*- 

ACCTTACACAGCCTTCACAACTTAACCACACTOrCGGCAAAT^ 
2281 ♦ ^ 4. 2340 

TCCAXTCTGTCOGAACI U l'ICXXTTTCTCTCACACCC Crin 'XAGACATO 

a TLHSLHMLTRVWGNSV SQOC 

b RVTA FT T*PECGEIL* AKIV- 

C VTQPSQ LNQSVGKFCKPRLS- 

CTCKXMAATATCCGTTTCATAAACATTTCACACTGCAACAACCTC 

2341 ^ — ♦ 4- : ♦ -"^ + 2400 

GACCCCTTATACXXrJUlCGTA U ' inU TAAAGTGTGACG a^Ul ' i rGACU'ICT'rACAGAG 

a L RNIRCINISH CNKLK-NVSW 

b CCISVA«TFKTATS*RMSHG- 

c AEYPLHKHFT LQQAEECLMG- 

CTTCAGAAACTCCCAAJ^CTACAGCTGATTCAAC 1UV1 CGACT:^AGAGAGATAGAGGAA 

2401 * + + + + > — + 2460 

CAAGTCTrrGACCGTTTCCATCTCCACTAACTTGACAACCTGACCTCICTCTATCTCCTT 

a VQKLPKLEVIELFDCR EIEE 

b rRNSQS*R*LNC'STAKR*RN- 

C SETPKARGD*TVRLQKDRGI- 

TTCATAAGCGAACACGAGAGTCCATCCGTCGAAGATCCAACATTCITCCCAAGCCTCAAG 

2461 + + + + + + 2520 

AACTATTCGCTTCTGCTCTCAGGTAGGCACCTTCTACCTTGTAACAAGGGTTCGG 

a LISEHESPSV EDPTLFPSLK 

b ••ANTRVHPSKIQHCSQA*R- 

C DKRTRESIRRRSNIVP KPED- 

ACCTTGJUSAACTACCGATCTCCCACAACTAAACACCATCCTCCCATCTCCA rrri CATTC 

2521 + ♦ + + + ' + 2580 

TCGAACTCTTGATCCCTACACCCTCTTGATTTCTCGTAGCACCCTAGAK 

a TLRTRDL PELKSILPSRFSF 

b P*ELGICQN*TASSHLDFHS- 

c LE N*aSARTKCHPPIS IFIP- 

CAJUUU^GTTGAAACATTAGTCATCACAAATTGCCCCAGAGTTAAGAAACTGCCGT^ ' 

2581 ^ * ^ + ^ 2640 

G T^ '7" I ' atJ AA C rrT G TAATCACTAGTCTTTAACCCGCTCTCAA' ri^,rinu ACCGC>^ 

a 5KVETLVITNCPRVKKL .PFQ 

b KKLKH*SSQI APELRNCRFR- 

c KS • M ISHHKX-PQS • ETAVS C- 

GAGACGAGOACCCAGATGAArTTGCCAACAGTlTATTGTGAGGAGAAATGGTTC 

2641 — * i-- * + + 2700 

ClCTCCTCCTGGCTCTACrrCAACCGTTGT^AAATAACACTCCTCTTTACCACCT^ 

a ERRTQMNLPTVYCEEKWWKA 

b. RGGPR'TCQQriVRRNCCKH- 

c EZDPDELAKSLL^CEMVEST- 
2701 — ^ ^ 



«. \ \ \ \ \ % \ ^ V \ \ \ % % - 





9. K 

2821 —4..-^ -'-"^IITirrrtl tt^" ^*^^'*°^'=*'=****CTACAaXTrATCTMT 

V V , V X . . "t % '» \ % % \ «j », V 

aaai — -.—iT—- 2302 ^^^^ ^^'^^ 

OTATTTlTC»rrTaXTXOaccCT 

* HXHQTIE • (SEQ ID NO: 2-5) 

• J J^TXLSA - (SEQ ID NO: 6-59) 

X ? K Y P R . (SEQ ID NO: 60-104) 

NQMX 

Enzymes that do not cut: 
KpnI 
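
Fig. 2 presents the nucleotide sequence together with translations in three reading frames, labeled a, b and c, with stop codons marked by an asterisk. As an illustrative sketch only, the following Python code shows how such a three-frame translation can be produced with the standard genetic code. The function names and the 9-nucleotide example are hypothetical, and the assignment of label a to offset 0, b to offset 1, and c to offset 2 is an assumption rather than something stated in the figure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq, offset=0):
    """Translate `seq` from `offset`; '*' marks a stop, 'X' an unknown codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(seq):
    """Return translations for offsets 0, 1, 2 under the labels a, b, c."""
    return {label: translate(seq, offset) for offset, label in enumerate("abc")}


# Hypothetical 9-nt example (not taken from Fig. 2).
print(three_frames("ATGGCCTAA"))
```

Codons containing any character other than A, C, G or T are reported as X rather than guessed, which is the conservative choice when the input sequence may contain ambiguous or unreadable bases.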
CArT^TTKAACCACCAACCCACCACTTAACAACCTrcCCSiaa^^ , S 



CCTCCAC 



« 



xrroiyrvAiSartlnClyAsnCluArgMecClyCysTrQ 



TTCTTCCCTCTCACCCACCrrcCT 

lVrV4lS«rGlnClyAsnCluArgMecClyCysTrp 

TATC^CTSCCCAACAATOrrrcCCCATT^^ ' 

TyrAiACyaAlaArgMecVdlGlyHisS-rVAlcVrrTTT^^^^^^^ ^^5 

iv-iyMiss«rV4lGluAlACIyProArgLouGlyL«uPro 



uiyArgCluAlaProAiaCiyL«uGinAspPheS«rA,pVaic:uArg 
wiyi,,uThrArgVal«spLeuProAsoAsnOluArgPheThrHis 



Fig. 3 
GxxcA iir T CCi^ :u :x L i^^ i re 7 ataaccaccsccccattxtx itiuj ctccaaaactccc & j s 

Cl.uCluL«uCiyAlAL.«uLAuTVrty«HiaClyProX i«I l«Ph«GlyTTpLys'mrPro 



AATCACAC Xmu CACA' ll/ r t,t:>UH.L: T' C A C IUC r UH.U ATAAACXCAC mt,U ICC XTTACT 695 
AsnAApS* rTrpH i aM«cS«r v« ll.«uThrC ly v« i AspLy sCiuTnrS«rS«r £ 1 mTtur 



rriV ACCATCCCCCACACCCCCCOCACCTACCAATSCCCCTCCATTA U; 1 l AATCACCCA 755 
PheHisAspProArgGlnGlyProAspLeuAlaMetProLeuAspTyrPheAsnGlnArg 



riSlA^ ATCCCAC u l ' i\. CACACCCAJlTC C ' lVI ACCCCTAACTACCACCCTA lVi ^ l^. ACCTC 815 
LeuAlaTrpGlnValProHisAlaMetLeuTyrArgEnd (SEQ ID NO: 106) 

<»OCCATCATCACAJkCCCCATCATCCCCCCACCACCTACCTC:AATCCUt*iU L^M. i'i i 1 1 87 5 



OC T CCC T AriVlX-l/r AT C C O JAA g ATCACCTCAAACAA lVIT-g^ XACA P, I ' i ' lL 1 iUC f 93 5 

ccx c r\: Acc :': x.lu^ tccatcacctcccttcccacacccc o, i h^ catcaccat 995 

. . . . • • 

CTCCCACAC C IV^C iOU lUU A iVA^l^^ iVC lT, ACCTAAAOCGA I I ' lHl^ ACCACAACCATCCS 1055 

CXACTCCCCCTTCCCATACCCTCCATCCTCAAjCCCCCCU il^CATCCCACCCCCAAOAA 1X15 

* « • • * _ * 

AXACACATA u I i^.u ccc c rc Aj L^ rru r Acc c rt: iu ccccccccccac l iv^^ t c cccat 1175 

* ...» • 

AAACACCCTCCACrCC G CA lT,L : i:>C iCU AAACCATCAATCCC L ; iV. i ' ^ LUC:'!'-. i'wOC 1235 

* 

CGACTCACTCCCCACCAACCTCACCCACCCCACCCCAACL 1 l^^XCGCXL,l\J~ iCLCOCAA 1295 

* « * 4 « 

C C ' rCC CCACCGA i ' lT.L iV^ ATACTCCCACAJkCACCATCAC L ri * w IVO I^ GAC 1346 

(SEQ ID NO: 105) 

Fig. 3 (continued) 